Target ABI: improve call parameters extensions handling

Posting an RFC here to invite more people to share their opinion on how to best deal with the problem of missing extension attributes on call arguments.

Please find the previous discussion and proposed patch here: target ABI: improve call parameters extensions handling by JonPsson1 · Pull Request #100757 · llvm/llvm-project · GitHub.

In short: while only a few targets have ABIs that require extension of narrow integer arguments anyone working on a front-end or an instrumentation pass could generate calls and may forget that the extensions are necessary for some users. Currently there is no way of catching these cases as there are legal cases, i.e. “struct-in-reg”. My idea is to introduce a NoExt attribute which are used for “struct-in-reg” in order to be able to always require an explicit extension type.

We have a lot of targets that use zeroext or signext for narrow integers, but iirc this only seems to be a problem in practice for s390x. Could you please remind me what the relevant difference is?

It’s mostly an issue on targets where “int” and “unsigned int” require some kind of extension. See llvm-project/llvm/include/llvm/Analysis/TargetLibraryInfo.h at 4e078e3797098daa40d254447c499bcf61415308 · llvm/llvm-project · GitHub .

Ok, there are more than just a few, but the point is that it is easy for someone working on a project on a target that that doesn’t care about extensions to forget about it entirely - even if just emitting a libcall somewhere.

There isn’t anything special about this on s390x, but there have been some issues which were tracked down and found to be the result of missing argument extensions which would be nice to prevent from happening again.

That comment doesn’t explain why though

If you’re trying to pass a value smaller than int, almost all targets require some kind of extension, so it’s hard to forget. If you’re trying to pass a value larger than int, it never requires extension. So the most likely mistake is incorrectly passing “int” or “unsigned int”, and that doesn’t affect the most popular targets.

After the work with the libcalls and sanitizers where those extensions needed to be added after the fact, I thought it was pretty clear that this does get overseen sometimes. And IIRC, X86 actually doesn’t have this requirement…? So any new (X86) LLVM developer might not have ever thought about these “weird” extension requirements.

Maybe “easy to forget” is a slight overstatement, but I can just imagine personally that if I was focusing hard on something and maybe then in one case needed to emit a libcall or any type of call, adding the right sign/zero extension on the arguments would be one of those things to remember. Especially if that was not part of my targets ABI it would be even easier to forget.

The other little detail here is that the target cannot be conservative and always do the right extension even if there is no attribute - it can’t know if it is supposed to be sign or zero extended. Only the front-end / call emitter knows this.

I think the goal here is reasonable, and the general approach of using a noext attribute is fine as well.

What we need to be careful about is that this does not introduce an undue burden for the vast majority of tests and code, which does not deal with the introduction of new libcalls. I think a hard requirement here is that we will not require any substantial number of noext annotations in tests. A corollary is that this should never become an IR verifier check.

To that effect, I’d have two questions about the approach in your patch:

  1. Is it possible to limit this only to verification of calls to declarations. In particular, exclude calls to definitions, and exclude returns in definitions. This would be sufficient to cover emission of libcalls with incorrect ABI attributes – are there more cases that are problematic in practice?
  2. Rather than making this enabled by default, would it make sense to disable it by default, and instead have clang enable a TargetMachine flag in assertion-enabled builds? That way things will crash and burn in end-to-end testing, but we do not impose any burden on lit tests.

If we go with the latter, then the former is not so important.

That’s a very good pint, as all the tests that basically ignore this and don’t pass this check is quite a long list.

#1: We have already agreed previously that we don’t need to check anything relating to local functions, as they are not visible to other compilation units. We do however have to make sure that any externally visible definitions have the right attributes and adhere to the ABI.

#2: My thoughts so far has been to keep the actual check in the SystemZ backend (as suggested by @efriedma-quic), which means that only SystemZ and Generic CodeGen tests need to be fixed. I think maybe having an llc option like -no-abi-args-check to use with llc tests might be an option, instead of going though them all and adding attributes on tests where they don’t matter (even though that would be ideal). I am not quite sure how your idea of having a target flag for this would work even though it sounds interesting. Do you mean that this check would take effect as a Function(?) attribute emitted by clang, which would then make all llc tests immune (except if that was added)? Interesting, but there’s a slight advantage to having a pure back-end check namely that Front-End developers wouldn’t have to remember to emit that attribute even. But then again, maybe that could be documented somewhere in the right place to help them out. A Function Attribute like “arg-abi-compliant” or something might be a nice alternative here - didn’t think about that before :slight_smile:

Another thing is that IIRC, there was some discussion (involving @aeubanks and maybe you?) some year ago looking into what would be best: to demand correctness in the function declaration, or in the call instruction, instead of kind of relying on both. Don’t remember the details or what came out of it. The patch is still assuming that these attributes could come from either the declaration or call instruction (but not necessarily both)…

What I actually had in mind is a flag in TargetMachine (or similar), which enables or disables the check globally, not a per-function attribute. Maybe that flag should even be enabled by default (so it automatically covers not just clang, but e.g. also lld) and only be disabled in llc by default.

My thinking here is that there is actually fairly little practical value to this check for lit tests: This is because nearly all places that introduce libcalls will either use target-independent testing, or tests using a target other than s390x. So those tests wouldn’t trigger the verification anyway. In practice, you’d only hit it with actual end-to-end tests on s390x (where “end-to-end tests” are not necessarily part of LLVM, just any kind of clang/lld usage). So we can limit the check to the end-to-end case in the first place, which should give the best signal to noise ratio.

Sounds like a great idea to me if we could enable this by default on a given target, but also disable it on specific tools, like llc, lli, … Agree that what we really need is the end-to-end assertions and not so much rewriting all the lit tests. I am however not aware that this is fundamentally possible - is there any Target class that knows what tool is being built?

Target classes don’t know which tool uses them, but we can still pass down options. E.g. we could set a TargetOptions flag inside llc somewhere around here: llvm-project/llvm/tools/llc/llc.cpp at 13865b09c6b0c767a8c3d4f49662e6c503260976 · llvm/llvm-project · GitHub

Thanks for this suggestion to use a TargetOptions flag to disable this for llc! I have updated the patch…

As discussed earlier, the verification stage is now only in effect in SystemZ LowerCall() and LowerReturn() implementations. Not sure if FastISel / GlobalISel would be covered or if other paths are still unchecked, but I’d say this is a good start. Maybe at some future point there could be a verfier at the I/R level just before (any) isel.

Initial support has now been committed with 1412022.