Help working around an LLVM 14 regression: __muloti4 is getting optimized to call itself!

Hi there. I just filed this issue: “LLVM 14 regression: __muloti4 is lowered with a recursive call despite nobuiltin attribute”.

This is a head scratcher and it’s currently blocking me from upgrading Zig from LLVM 13 to 14.

I was hoping for some brainstorming here about how to work around this issue. Any ideas?

nobuiltin in particular is roughly equivalent to clang’s -fno-builtin: it implies that you’re building a “freestanding” binary, so we don’t assume you have a C library. We still do assume that you have memcpy, memmove, memset, and the symbols exposed by compiler-rt.builtins. In general, generating code inline for those symbols is either hard to do with our current infrastructure or would massively bloat the generated code (or both). So there’s no flag to disable usage of compiler-rt.builtins symbols.
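To make the compiler-rt dependency concrete, here is a small sketch. It assumes a 64-bit target such as x86-64 and a compiler that supports the `__int128` extension (clang or GCC); `div128` is a name made up for illustration. On such targets the division below is normally not expanded inline; the backend emits a call to the runtime helper `__divti3`, regardless of -fno-builtin.

```c
/* Hypothetical example function: a plain 128-bit signed division.
   On x86-64, clang and GCC typically lower the `/` below to a call
   to __divti3 from compiler-rt (or libgcc) rather than inline code. */
__int128 div128(__int128 a, __int128 b) {
    return a / b;
}
```

Compiling this with `-S` and looking for `__divti3` in the assembly should show the call. It also shows why writing `__divti3` itself as `return a / b;` cannot work.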

As you’ve discovered, this can lead to issues if you’re trying to write an implementation of one of those functions using LLVM itself. There’s currently no unified solution here; we add hacks as specific situations come up.

In your particular case, you can probably use something equivalent to the C asm("":"+r"(x)); as an optimization barrier.
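As a rough sketch of what such a barrier could look like, here is a checked multiply in C, loosely in the style of compiler-rt's __mulo*i4 routines. Everything here is an assumption for illustration: the names `opaque` and `checked_mul` are hypothetical, it uses `long` instead of a 128-bit type so that the value fits a single register for the `"+r"` constraint, and it relies on GCC/clang inline asm syntax. The barrier is placed on the divisor of the overflow check so the optimizer can no longer prove it is the same value that went into the multiply. Whether that defeats the specific transform discussed here would need to be confirmed by inspecting the generated IR.

```c
#include <limits.h>

/* Hypothetical helper: an empty asm statement that claims to read and
   modify x, so the optimizer must treat the returned value as unknown. */
static inline long opaque(long x) {
    __asm__ volatile("" : "+r"(x));
    return x;
}

/* Hypothetical checked multiply: returns the wrapped product and sets
   *overflow to 1 if the mathematical result does not fit in a long. */
long checked_mul(long a, long b, int *overflow) {
    *overflow = 0;
    /* Multiply in unsigned arithmetic to get wrapping behaviour. */
    long result = (long)((unsigned long)a * (unsigned long)b);
    if (a == LONG_MIN) {
        if (b != 0 && b != 1) *overflow = 1;
        return result;
    }
    if (b == LONG_MIN) {
        if (a != 0 && a != 1) *overflow = 1;
        return result;
    }
    long abs_a = a < 0 ? -a : a;
    long abs_b = b < 0 ? -b : b;
    if (abs_a < 2 || abs_b < 2) return result;
    /* Barrier: at run time div_b equals abs_b, but the compiler cannot
       prove it, so it cannot tie this check back to the multiply above. */
    long div_b = opaque(abs_b);
    if ((a < 0) == (b < 0)) {
        if (abs_a > LONG_MAX / div_b) *overflow = 1;
    } else {
        if (abs_a > LONG_MIN / -div_b) *overflow = 1;
    }
    return result;
}
```

The barrier costs nothing at run time beyond keeping the value in a register, which is why it is attractive as a stopgap.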

We could also look into restricting the specific optimization that’s generating smul.with.overflow.

As a sort of more general solution, I’ve been thinking about making the compiler generate definitions for the compiler-rt symbols, instead of depending on a separate library. But I don’t have an implementation of that, and I probably won’t be working on it anytime soon.