What's the principle to add builtins in clang?

Hi all,

Background:
Recently I have been trying to enable Coroutine Heap Elision in some code bases. An introduction to Coroutine Heap Elision is here: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0981r0.html.
In LLVM, we decide to elide a coroutine if we can prove that coro.id, which marks a coroutine, reaches a coro.destroy on every path to the exit of the current function.
For example (using if-else for simplicity):

coro foo() {
  %handle = call i8* @llvm.coro.id(...)
  ; some work
  if (...)
    call void @llvm.coro.destroy(%handle)
  else {
    ; other work
    call void @llvm.coro.destroy(%handle)
  }
}

This coroutine would be elided.
But in this case:


coro foo() {
  %handle = call i8* @llvm.coro.id(...)
  ; some work
  if (%handle)
    call void @llvm.coro.destroy(%handle)
}

It wouldn’t be elided, because coro.destroy is not reached on every path. So I want to add a builtin to mark that the corresponding coroutine is already dead. Let me call it __builtin_coro_dead. Then we can write:


coro foo() {
  %handle = call i8* @llvm.coro.id(...)
  ; some work
  if (%handle)
    call void @llvm.coro.destroy(%handle)

  call void @__builtin_coro_dead(%handle)
}

Now it would be elided.

Question:
The above is just background. This thread isn’t meant to ask whether __builtin_coro_dead is a good way to solve the problem;
we can discuss that in another thread. My question here is: what is the principle for judging whether we should add a new builtin?
End users can use builtins, yet I can’t find builtins in the C++ standard (N4878). So if we can add new builtins arbitrarily,
compiler writers can change the language without changing the standard document, which seems very odd to me.
I can’t find any related rules, so I am asking here for your suggestions.

Thanks,
Chuanqi

I'm not really following this bit about changing the language without
changing the standard document, or what builtins have to do with the
C++ standard - could you explain this in more/different words,
perhaps?

In general, builtins are a compiler implementation detail (nothing to
do with the C++ standard), and adding them is a tradeoff like adding
new instructions to LLVM IR (though builtins are generally lower cost
than instructions: they're easier to add and remove, and aren't such
a fundamental part of the IR). Does the new builtin or instruction
pull its weight? Adding new features to the IR in either case comes at
a cost of implementation complexity (now optimization passes need to
know about these new features), and if the semantics can be expressed
reasonably cleanly with existing IR features, that's preferable.
Failing that, it helps if the IR feature can be generalized in some
way to maximize the value (making it more usable for a variety of
problems people are having trouble solving without it) while
minimizing the cost (if it generalizes well to something that is
easy/reasonable for IR consumers to handle, matches concepts they're
already modeling, etc.).

Hi,

> I’m not really following this bit about changing the language without
> changing the standard document, or what builtins have to do with the
> C++ standard - could you explain this in more/different words,
> perhaps?

I mean that C++ users can use builtins in their source code, although this is not recommended.
In fact, in some projects that need to move from GCC to Clang, I have found uses of some builtins.
My point is that although builtins are not part of the language standard, people can actually use them in their code.
In other words, if the compiler adds new builtins, the actual semantic space becomes larger than the design space.
That’s what I meant: we in fact change the language without changing the language standard document.
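
To make this concrete, here is a minimal sketch of how user code can end up depending on a builtin. It assumes Clang or GCC; `portable_popcount` is a hypothetical helper invented for illustration, and the `__has_builtin` guard is itself a compiler extension rather than standard C++.

```cpp
// Sketch only: portable_popcount is a hypothetical helper, not from this thread.
// __has_builtin is a compiler extension (Clang, and GCC 10 or later).
#ifndef __has_builtin
#define __has_builtin(x) 0  // Compilers without __has_builtin: assume no builtins.
#endif

// Counts the set bits in value, using the compiler builtin when available.
inline int portable_popcount(unsigned int value) {
#if __has_builtin(__builtin_popcount)
  // Non-standard: this name is provided by the compiler, not by ISO C++.
  return __builtin_popcount(value);
#else
  // Standard C++ fallback: clear the lowest set bit until none remain.
  int count = 0;
  while (value != 0) {
    value &= value - 1;
    ++count;
  }
  return count;
#endif
}
```

Code that calls `__builtin_popcount` without such a guard compiles only on compilers that happen to provide that name, which is the sense in which a new builtin enlarges what users can write.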

Thanks,
Chuanqi

> Hi,
>
> > I'm not really following this bit about changing the language without
> > changing the standard document, or what builtins have to do with the
> > C++ standard - could you explain this in more/different words,
> > perhaps?
>
> I mean that C++ users can use builtins in their source code, although this is not recommended.

Ah, I think LLVM builtins aren't available to C++ source code by
default - we wrap them in C intrinsics when that's desirable, for
instance.

> Ah, I think LLVM builtins aren’t available to C++ source code by
> default - we wrap them in C intrinsics when that’s desirable, for
> instance.

Hmm, I tried implementing a new clang builtin. Then I could use
the new builtin in C++ source code directly. And the library implementation
also uses clang builtins directly, without including or declaring anything.

I guess the gap is that I meant clang builtins instead of LLVM intrinsics.
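
As a small sketch of the "without including or declaring" point, assuming Clang or GCC and a 32-bit `unsigned int`: the existing builtin `__builtin_bswap32` can be called with no header at all, and both compilers accept it in a constant expression. The name `swap_bytes` is hypothetical and used only for illustration.

```cpp
// Sketch only: swap_bytes is a hypothetical name. No header declares
// __builtin_bswap32; Clang and GCC recognize the name directly.
constexpr unsigned int swap_bytes(unsigned int value) {
  return __builtin_bswap32(value);  // Reverses the order of the four bytes.
}

// Checked at compile time, so the compiler itself must understand the builtin.
static_assert(swap_bytes(0x11223344u) == 0x44332211u, "byte order reversed");
```

By contrast, an LLVM intrinsic such as `@llvm.bswap.i32` is a name in the IR and cannot be spelled in C++ source.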

> > Ah, I think LLVM builtins aren't available to C++ source code by
> > default - we wrap them in C intrinsics when that's desirable, for
> > instance.
>
> Hmm, I tried implementing a new **clang** builtin. Then I could use
> the new builtin in C++ source code directly. And the library implementation
> also uses clang builtins directly, without including or declaring anything.
>
> I guess the gap is that I meant **clang** builtins instead of **LLVM** intrinsics.

Ah, sure - your initial email example/code snippets looked like LLVM
intrinsics to me (the @ and the . in names, etc). (& I might've
muddled up the intrinsics/builtins terminology)

- Dave

> Ah, sure - your initial email example/code snippets looked like LLVM
> intrinsics to me (the @ and the . in names, etc). (& I might’ve
> muddled up the intrinsics/builtins terminology)

Sorry for the confusion. My intention was to use pseudo-IR (including `if` for simplicity).

Thanks,
Chuanqi