How are header files generated from .def files?

I am trying to understand how builtins are declared in Clang, how the header files for those builtins are generated, and how the builtins are mapped onto LLVM IR intrinsic functions. I am using the RISC-V vector builtins as a case study. So far I’ve understood that TableGen generates the riscv_vector_intrinsics.inc file from riscv_vector.td, and that riscv_vector_intrinsics.inc is then included in BuiltinsRISCVVector.def.
Please help me understand how the riscv_vector.h file is generated from BuiltinsRISCVVector.def.

It looks like the header isn’t generated from the .def file, but from the original TableGen file: llvm-project/CMakeLists.txt at a5e1a93ea10ffc06a32df6d9f410e9b297ed136d · llvm/llvm-project · GitHub

Thanks for the reply. What is the purpose of .def files, then? And what about builtins that are not declared using .td files, e.g. those in BuiltinsAArch64.def?

I am not sure about builtins specifically.

In general, a .def file is a list of macro invocations. Before including it, you define the macro it invokes:

#define ONE_OF_THE_MACRO_NAMES ...
#include "thefile.def"

That way you choose what each invocation expands to. For example, in the AArch64 target parser we used to emit both switch cases and structs from the same data, by defining the macro differently before each include.
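A minimal sketch of that pattern, for illustration only. To keep it to a single file, the entries that would normally sit in the .def file are folded into one list macro instead of being pulled in with #include. FEATURE_LIST, the Feature enum, and the entries themselves are hypothetical names, not taken from the AArch64 target parser.

```cpp
#include <string>

// Hypothetical stand-in for the contents of a file such as "features.def".
// In a real project these entries would live in that file; here they are
// wrapped in a list macro so the example compiles on its own.
#define FEATURE_LIST(X) \
  X(None, "none")       \
  X(Simd, "simd")       \
  X(Crypto, "crypto")

// Expansion 1: each entry becomes an enumerator.
enum class Feature {
#define AS_ENUMERATOR(ID, NAME) ID,
  FEATURE_LIST(AS_ENUMERATOR)
#undef AS_ENUMERATOR
};

// Expansion 2: each entry becomes a switch case.
std::string featureName(Feature F) {
  switch (F) {
#define AS_CASE(ID, NAME) \
  case Feature::ID:       \
    return NAME;
    FEATURE_LIST(AS_CASE)
#undef AS_CASE
  }
  return "unknown";
}

// Expansion 3: each entry becomes a struct initializer in a table.
struct FeatureInfo {
  Feature ID;
  const char *Name;
};

const FeatureInfo FeatureTable[] = {
#define AS_TABLE_ENTRY(ID, NAME) {Feature::ID, NAME},
    FEATURE_LIST(AS_TABLE_ENTRY)
#undef AS_TABLE_ENTRY
};
```

Adding a fourth entry to the list would update the enum, the switch, and the table together, which is the boilerplate the .def approach is meant to remove.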

So a def file is just a way to remove some boilerplate when you need that flexibility.

Once things get complicated, TableGen or straight C++ is more useful. In the case of riscv_vector.h, I assume we know exactly the format we want, so there’s no reason to add a layer of macros in between.

It is very possible that the AArch64 builtins are not declared using .td files simply because it didn’t occur to us at the time, especially if the set of builtins is small. I assume a vector extension has a lot more to deal with.

I don’t think there’s any reason you couldn’t use either method (or both, as RISC-V has done).

So in the case of riscv_vector.td, the developers must have added a corresponding custom TableGen backend too?

Exactly, there is a tablegen backend for it: llvm-project/RISCVVEmitter.cpp at a5e1a93ea10ffc06a32df6d9f410e9b297ed136d · llvm/llvm-project · GitHub
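Conceptually, a backend like that walks a list of records and prints the output text in exactly the shape it wants. The sketch below is a toy stand-in to show the idea, not how RISCVVEmitter.cpp works: IntrinsicRecord, emitHeader, and the TOY_VECTOR_H guard are hypothetical, and a real backend reads its records from the parsed .td file through the TableGen libraries rather than from a hand-built vector.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record type: a drastically simplified stand-in for the
// records a TableGen backend receives after the .td file is parsed.
struct IntrinsicRecord {
  std::string Name;       // user-facing function name
  std::string Prototype;  // C parameter list, already spelled out
  std::string ReturnType; // C return type
};

// Print one declaration per record, wrapped in an include guard.
std::string emitHeader(const std::vector<IntrinsicRecord> &Records) {
  std::ostringstream OS;
  OS << "#ifndef TOY_VECTOR_H\n#define TOY_VECTOR_H\n\n";
  for (const IntrinsicRecord &R : Records)
    OS << R.ReturnType << " " << R.Name << "(" << R.Prototype << ");\n";
  OS << "\n#endif // TOY_VECTOR_H\n";
  return OS.str();
}
```

Because the generator controls the whole output, there is no need for an intermediate layer of macros when the target format is fixed.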

And if you want to know how TableGen backends work overall, I wrote a tutorial about that (shameless plug): llvm-project/sql_query_backend.ipynb at main · llvm/llvm-project · GitHub

It is in Python but is based on a pre-existing C++ backend that is linked at the start of the guide.
