Thanks a lot for your questions. David has already provided very good
answers. I have added my views and answers inline.
> I work on WebAssembly, and I was hoping we would eventually use LLVM
> libc for an end-to-end Wasm toolchain. I have some questions about the
> "ground truth" approach to the libc API. I am sorry if those have been
> asked; I could not find the answers looking through mailing list
> messages and code reviews.
I now have a patch out for review: https://reviews.llvm.org/D70197
The patch shows the up-to-date header generation scheme.
> I was wondering what does API generation buy for the developers and
For the users, the benefit is probably minimal: the header files they
include will have much less macro and #ifdef clutter.
Developers can be of various kinds, so let me try to list the benefits
for two kinds of developers I can think of:
1. For developers working on LLVM-libc: A clear-cut separation of
standards, platform configs and implementation makes adding a new API,
implementation or platform config (like the "config/linux/api.td"
file in the above patch) a straightforward task.
2. For developers putting together a libc for their platform: Instead
of adding inclusion and exclusion macros to header files, they merely
write a config for their platform, like the "config/linux/api.td" file
in the above patch.
> Maybe the question is how previous implementations of libc got away
> without generating headers, but also whether API generation is a
> reasonable and foolproof solution.
I have had this question myself and tried asking around to get
answers. Unfortunately, I have not gotten a good answer yet.
> Most importantly, the motivation seems to be that there are a few
> potential standards a libc implementation needs to comply with. But how
> many substantially different APIs are there realistically? If it is in
> lower single digits, does this really make it worth the effort?
Yes, I agree that the number of standards we have to support will be
fairly small. But, there will be a much larger number of configs that
we will have to cater to. That is, there will be a large number of
platforms which will want to pick and choose from the small number of
standards we support. Header generation makes this possible without
using hard to debug/maintain #ifdefs in the header files. Note that I
used the word "cater" and not "support" because I do not think we want
to support all of the configs upstream. A lot of these configs will be
maintained downstream and the proposed header generation scheme makes
it straightforward to maintain them downstream.
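
As a rough illustration of the pick-and-choose idea, here is a minimal sketch in Python rather than TableGen. It is not the tool from the patch: the `SPECS` table, the two platform configs, and `generate_header` are hypothetical names made up for this example, and include guards and type definitions are left out. Only the C prototypes themselves are real.

```python
# Hypothetical "ground truth": what each standard prescribes, keyed by
# function name. The real scheme in the patch uses TableGen files instead.
SPECS = {
    "stdc": {
        "memcpy": "void *memcpy(void *, const void *, size_t);",
        "strlen": "size_t strlen(const char *);",
    },
    "posix": {
        "strdup": "char *strdup(const char *);",
    },
}


def generate_header(header, config):
    """Emit only the declarations a platform config selects for one header."""
    # Flatten all standards into a single name -> prototype lookup table.
    prototypes = {}
    for standard in SPECS.values():
        prototypes.update(standard)
    lines = [f"/* {header}: generated file, do not edit */"]
    for name in config[header]:
        lines.append(prototypes[name])
    return "\n".join(lines) + "\n"


# Two hypothetical platform configs choosing different subsets.
linux_config = {"string.h": ["memcpy", "strlen", "strdup"]}
baremetal_config = {"string.h": ["memcpy", "strlen"]}

print(generate_header("string.h", baremetal_config))
```

Each config yields its own plain header, so a downstream platform only maintains its list of selected names and the emitted file contains no conditional compilation.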
> Secondly, the libc API is not only types and function prototypes; it
> typically also depends on "feature test macros". I am not sure it is
> possible to gracefully support those in a generated API. Encoding test
> macros in API "ground truth" rules would make the API rules as complex
> as the C macro code they are trying to replace. Leaving test macros up
> to the C header files would result in a mix of preprocessor and rule
> logic which would probably be more confusing than going all the way in
> either (preprocessor or generation) direction.
If you look at the patch I have pointed you to, the ground truth files
are devoid of any test macros. They are merely a listing of what the
standards prescribe. Also, as David pointed out, there is scope to
extend them with platform independent annotations. We do not yet know
what these annotations are going to look like, but the current setup
keeps that door open if and when we are ready for them.
About feature test macros, the current setup (in the above patch) does
not include them. However, I am of the opinion that we can certainly
include them. For example, if the ground truth file for a standard (in
the "spec" directory) can specify the feature macro for that standard,
then we can enclose the generated API from that standard within the
corresponding feature macro. This would work best if we have some
baseline over which extensions are enabled optionally based on the
corresponding feature macro being defined.
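
The baseline-plus-extension idea could look roughly like the following Python sketch. This is an assumption about how such a generator might behave, not something the patch implements: `emit_standard` is a hypothetical helper, and `_GNU_SOURCE` with `mempcpy` is used only as a familiar example of an extension guarded by a feature macro. Feature test macros that carry version values (such as `_POSIX_C_SOURCE`) are not modeled here.

```python
def emit_standard(decls, feature_macro=None):
    """Return declarations, wrapped in #ifdef when a feature macro is given."""
    if feature_macro is None:
        # Baseline standard: declarations are always visible.
        return list(decls)
    # Extension: visible only when the user defines the feature macro.
    return [
        f"#ifdef {feature_macro}",
        *decls,
        f"#endif /* {feature_macro} */",
    ]


# Baseline declarations, followed by an optional extension.
baseline = emit_standard(["size_t strlen(const char *);"])
extension = emit_standard(
    ["void *mempcpy(void *, const void *, size_t);"],
    feature_macro="_GNU_SOURCE",
)
header_text = "\n".join(baseline + extension)
print(header_text)
```

The point of the sketch is that the guard comes from a single attribute on the standard's spec, so the ground truth files themselves stay free of preprocessor logic.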
> Finally, a somewhat rhetorical point on precedent and expertise. There
> is enough precedent for a portable libc API written directly; likewise,
> C/C++ developers can understand and modify C headers without ramp-up -
> not sure that can be said about TableGen. Writing header files is a
> relatively simple part of the development process and there is a lot of
> it happening inside and outside of LLVM.
True that TableGen is an unfamiliar format. But if you consider glibc
for example, sure you can edit the header files directly to add/change
the API. But, you will have to then add a conformance test in a data
file which is not a normal C header file. On the other hand, to
add/change the API in LLVM libc, one only needs to edit the
corresponding ground truth file written in TableGen format; the header
file is tool generated. So, I think that the developer burden in
LLVM-libc is actually much less: 1. One does not have to worry about
editing header files and tripping over the macros and #ifdefs in them,
2. It eliminates the need for conformance tests.