Hi Anastasia,
Thank you for taking time to review.
Potential problems with this approach are:
- The maintenance costs as it's harder to modify Clang than just
change the header in case we want to extend the list of builtins or
modify them. Meaning that you have to add logic to clang to handle
what is essentially just a list of declarations. Clang builtins
weren't really intended to solve this problem.
It is definitely harder to modify Clang vs a change in a header file,
although it is expected that all difficult code is shared between all
builtins, making it easy to add a new builtin.
- It is more risky in terms of bugs because it might not be easy to
cover all use cases.
This might be true, considering a custom typecheck code that we have to
write to handle implicit conversions (which comes for free if we use
headers).
On the other hand, you'll need to write this typecheck code once, and
all other OpenCL builtins can re-use it. Surely some bugs may happen,
but the problem is localized and it is easy to test.
- Affects parsing time to check all the extra switch cases.
Switch is generally faster than any other lookup, so I assume that there
will be no performance implications after adding more switch-cases.
It also makes sense to put these switch-cases under `if (Lang.OpenCL)',
so it will have a minimal impact on other languages.
- We will need to replace the builtins from the header with macros
aliasing Clang builtins. I find macros really bad in terms of error
reporting.
Agree. Any diagnostic related to builtins will be accompanied by "note:
expanded from macro 'convert_char'". This makes error messages a bit
harder to read.
I wonder if there is a way to have a user-facing alias for a builtin
function without using a preprocessor.
But also they [macros] will still need to be parsed.
Right, but the number of macros should not be high: we need only one
macro to handle all overloadings. I expect ~600 macros for different
`convert_xxx' builtins, ~200 macros for math builtins, ~50 macros for
vload/vstore, etc.
In essence, there should be ~1000 macro declarations vs existing ~15000
function declarations. This has to be measured of course, but I expect a
significant improvement in parsing in comparison with the existing
(header) design.
Also something important to consider is the impact on parsing for
other languages, i.e. C/C++. I think we should evaluate this
carefully. Do you have any numbers for this already?
From my understanding, there shouldn't be any impact on parsing other
languages: __builtin_opencl_* are LANGBUILTIN, and they should only be
parsed for OpenCL.
My team has a slightly different solution to this problem. It is based
on generation of tries. Something similar to attribute parsing, see
generated AttrParsedAttrKinds.inc. Unfortunately it's not in a shape
that we can currently share. But it should not take longer than 2-3
weeks to upload a prototype if there is interest.
Having an alternative to consider is always better. This can help us to
get a solution that fits for everyone.
To give a rough idea, it uses a TableGen mechanism to describe the
OpenCL builtin functions (similar to intrinsics in LLVM). During the
Clang build phase, a trie file is generated. There is a hook in Clang
that uses the generated trie to lookup the builtin function name
during call resolution in Sema. After a successful lookup it generates
the AST declaration of a function with overloads and the rest just
follows the unmodified Clang flow (i.e. checking overloading,
mangling, etc). This will cover all builtins, not just conversions and
maths. It is easy to extend. It doesn't need to change much of Clang
code, as it allows to follow normal compilation flow.
So when Sema reaches a call to, say, `acos', and cannot resolve it to
any existing declaration, the hook creates all `acos' function
declarations for all combination of types:
float acos(float);
float2 acos(float2);
...
// same for other vector types, double and half
Then Clang's standard machinery chooses which overloading to use and
performs all necessary conversions.
Do I understand this correctly?
Do you think it would make sense to evaluate this solution as well?
Absolutely. Although, there are several drawbacks/things to consider:
1) This approach introduces another type of 'builtin function' to
Clang: it is built-in into Clang, but it looks and behaves exactly
as a user-defined function (except that it doesn't have a
SourceLocation).
This is something new for Clang (unless I miss something), so we
probably need more people from Clang community to review this.
2) If OpenCL builtin looks and behaves as a user-defined function, it
will always be mangled the same way as it is mangled now (with
headers).
We cannot emit LLVM intrinsics instead of mangled functions, and,
as I mentioned, LLVM intrinsics for builtins can make SPIR-V
translation easier. Again, see [1].
3) If my assumption is correct, you'll be generating dozens of
function declarations (overloadings for all types) for every
builtin that is used in a program. This might not be optimal in
terms of performance, but it'll probably be good enough for general
use-case.
Anyway, this approach looks pretty interesting.
[1]: https://lists.llvm.org/pipermail/llvm-dev/2018-September/125977.html