Adopting tailcc as the calling convention of choice for “performance tweakers” like myself sounds great, as long as it’s possible to port over the innovations of preserve_none that I mentioned above.
That would make tailcc the natural choice for chains of tail calls. It would have both better performance and better predictability than other options.
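To make "chains of tail calls" concrete, here is a minimal sketch of the pattern being discussed: each handler finishes by tail-calling the next one. The names `Insn`, `Handler`, `op_add`, `op_mul`, `op_halt`, `run` and the `MUSTTAIL` macro are made up for illustration and do not come from any real project. The sketch uses the default calling convention, not tailcc or preserve_none.

```cpp
#include <cstdint>

// Use the attribute only where the frontend advertises it. This check
// says nothing about whether the backend can actually honor it.
#if defined(__has_cpp_attribute)
#  if __has_cpp_attribute(clang::musttail)
#    define MUSTTAIL [[clang::musttail]]
#  endif
#endif
#ifndef MUSTTAIL
#  define MUSTTAIL  // fallback: plain return, tail call not guaranteed
#endif

struct Insn;
// Every handler shares one signature, which musttail requires of
// caller and callee.
using Handler = std::int64_t (*)(const Insn *pc, std::int64_t acc);

struct Insn {
    Handler handler;       // code to run for this instruction
    std::int64_t operand;  // immediate value used by the handler
};

std::int64_t op_add(const Insn *pc, std::int64_t acc) {
    acc += pc->operand;
    ++pc;
    MUSTTAIL return pc->handler(pc, acc);  // jump to the next handler
}

std::int64_t op_mul(const Insn *pc, std::int64_t acc) {
    acc *= pc->operand;
    ++pc;
    MUSTTAIL return pc->handler(pc, acc);
}

// Ends the chain by returning the accumulator to whoever called run().
std::int64_t op_halt(const Insn *, std::int64_t acc) { return acc; }

std::int64_t run(const Insn *program) {
    return program->handler(program, 0);
}
```

A program is then an array such as `{{op_add, 5}, {op_mul, 3}, {op_halt, 0}}` passed to `run`. The interpreter state lives in the arguments, which is why the choice of calling convention matters so much for this style.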
Does tailcc have existing stakeholders that we’d need to get sign-off from, or could we start sending PRs? Hopefully nobody is relying on ABI stability of tailcc. Who are the primary users of tailcc today?
Improving the backend diagnostics for failed tail calls is a great idea; I strongly support that.
I know this is a question to LLVM/clang users, but I didn’t see anyone ask the same question of the GCC folks, so I will provide the GCC experience here; I helped reduce the testcases for all of these issues.
For GCC 15, the clang::musttail and gnu::musttail attributes were added. A lot of last-minute fixes (within 2 months of the GCC 15.1.0 release) were needed to support the same use cases as they were being used with clang/LLVM. Thankfully none of them seem to have caused any issues, which is good.
The biggest issue is how GCC implements the attribute compared to how LLVM implements it. GCC allows musttail on slightly more returns; this was not the cause of the issues though. In GCC’s case the problem is that things like -fprofile-generate (and -fsanitize=address) would put code after the call, which caused tail call generation to fail (and produce an error, but not crash the compiler).
Also, GCC originally didn’t implement the requirement to still allow musttail when local variables escape. GCC now warns for that case but allows it. I would say this is the biggest thing I can see most folks misusing with musttail (but it is no different from a variable going out of scope earlier).
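A small sketch of the escaping-locals hazard and one way to avoid it. The names `State`, `sum_down`, `sum_to` and the `MUSTTAIL` macro are invented for this example. The idea is that state the chain needs is owned by a frame that stays alive, rather than by a function that is about to be replaced by its tail callee.

```cpp
#if defined(__has_cpp_attribute)
#  if __has_cpp_attribute(clang::musttail)
#    define MUSTTAIL [[clang::musttail]]
#  endif
#endif
#ifndef MUSTTAIL
#  define MUSTTAIL  // fallback: plain return, tail call not guaranteed
#endif

struct State {
    long total;
};

// Hazardous shape, shown only as a comment:
//
//   long bad(long n) {
//       State st{0};
//       MUSTTAIL return sum_down(&st, n);  // bad's frame is gone once the
//   }                                      // callee runs, so &st dangles
//
// Safe shape: the pointer refers to storage owned by a frame that stays
// alive for the whole chain.
long sum_down(State *st, long n) {
    if (n == 0) {
        return st->total;
    }
    st->total += n;
    MUSTTAIL return sum_down(st, n - 1);  // passes the pointer along unchanged
}

long sum_to(long n) {
    State st{0};              // lives until sum_to itself returns
    return sum_down(&st, n);  // ordinary call, so this frame is kept
}
```

Here `sum_to` deliberately makes a normal call, so `st` outlives every step of the chain.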
The other big issue that showed up with musttail was GCC’s IPA passes, which would remove the return type because the return value was known to be a constant. This was worked around by having the IPA passes not change the return type or allow the return value to be propagated into a musttail call.
Passing aggregates with a recursive call (with musttail) was not something GCC supported before, but musttail forced this to be fixed. So this was something which forced GCC to improve.
Aggregate returns were (and are) not as well tested and turn out to still have issues, even on GCC trunk. This didn’t show up in the distro testing (protobuf and cpython) and was found with code produced by a wasm-to-C translator.
Also, most of the testing was with C++ code; it turns out GCC’s C front-end support for musttail was even less tested, and the C front-end has a bug where, if you use the attribute once in a scope and then do another return in the same scope, it assumes the second one still has musttail on it. This is just a bug/oversight which will be fixed, but I was trying to provide a full update on how musttail usage is going.
One benefit of the way GCC implemented musttail over LLVM is that it will not cause an internal compiler error if the musttail fails. The downside is that GCC might reject musttail in some cases where LLVM would accept it. Also, GCC’s error messages explaining why a musttail call was rejected are not the best in many cases (there are a few bug reports filed about improving them). GCC’s optimizations sometimes get in the way of musttail. In other cases (e.g. the sparc backend), -O0 rejects all tail calls (it needs delay slot optimization turned on; I have not looked into why though). Arm thumb1 also rejects all tail calls. PowerPC64 rejects some tail calls too (the same issue exists in LLVM; TOC). RISC-V and s390 reject some aggregate arguments too (the s390 issue was worked around in GCC; the RISC-V one not so much; the RISC-V issue also happens with LLVM, and I think it is reported).
Edit: correct spelling.
We recently started to use clang::musttail in clang itself; here are a few notes:
- __has_cpp_attribute(clang::musttail) always returns true, even if the backend then decides that tailcalls are never supported. That means tailcalls need to be disabled on a case-by-case basis, because we can’t test, and aren’t testing, every host compiler and target arch combination there is.
- On MSVC, tailcalls aren’t guaranteed to happen in debug mode, so the implementation can’t rely on them in that case.
- Depending on the host compiler, sanitizers are also a problem.
- Like Python, I tried to use the preserve_none calling convention, which also needs to be disabled in various circumstances. Again, the frontend claims the attribute is supported, but then the target can disagree and the frontend won’t know. There are also known bugs on aarch64, and other problems with older gcc and clang versions, even on x86_64.
It’s pretty cool if it works but overall seems very brittle.
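Since the feature-test macro alone cannot be trusted, the notes above suggest keeping an explicit off switch next to it, plus a fallback that does not depend on tail calls. A rough sketch, where `INTERP_NO_TAILCALLS`, `INTERP_ASAN`, `INTERP_USE_TAILCALLS` and `countdown` are hypothetical names invented for this example (they are not what clang's bytecode interpreter uses):

```cpp
// Detect AddressSanitizer: GCC and MSVC define __SANITIZE_ADDRESS__,
// clang reports it through __has_feature.
#if defined(__SANITIZE_ADDRESS__)
#  define INTERP_ASAN 1
#elif defined(__has_feature)
#  if __has_feature(address_sanitizer)
#    define INTERP_ASAN 1
#  endif
#endif

// Enable tail calls only if the frontend advertises the attribute AND
// nobody opted out. INTERP_NO_TAILCALLS is meant to be set by the build
// system for host/target combinations known to be broken.
#if !defined(INTERP_NO_TAILCALLS) && !defined(INTERP_ASAN)
#  if defined(__has_cpp_attribute)
#    if __has_cpp_attribute(clang::musttail)
#      define INTERP_USE_TAILCALLS 1
#    endif
#  endif
#endif

#ifdef INTERP_USE_TAILCALLS
// Tail-call flavor: every step is a guaranteed tail call.
int countdown(int n, int steps) {
    if (n == 0) {
        return steps;
    }
    [[clang::musttail]] return countdown(n - 1, steps + 1);
}
#else
// Fallback flavor: same result using a plain loop.
int countdown(int n, int steps) {
    while (n != 0) {
        --n;
        ++steps;
    }
    return steps;
}
#endif
```

Both flavors have to be kept in sync by hand, which is part of why this approach feels brittle.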
GCC tries to handle this case at -O0, but I know it is very brittle.
> Depending on the host compiler, sanitizers are also a problem.
I am curious to know more. I know there were some issues with GCC 15 with respect to sanitizers and musttail, but I thought they were all fixed before the release of 15.1.0.
Or was this with an LLVM host compiler?
I would be interested in the GCC issues; at least in having them filed.
gcc 15 was exactly the problem in llvm/llvm-project pull request #188419 by tbaederr ([clang][bytecode] Reapply "Use tailcalls via `[[clang::musttail]]`"), no sanitizers involved AFAICS.
clang19+asan+x86_64 is what I’m looking at right now, so clang is definitely also a problem. I can’t come up with more gcc configurations right now; I wasn’t paying that much attention to the host compiler, to be honest.
Oh I see LLVM does this via an indirect jump to get around having to save/restore ebp while GCC just rejects it. And LLVM does not do tail calls otherwise even to local functions.
Filed a GCC bug report for how GCC should handle this case.