[RFC] Ground truth for public API

Hello all,

In the source tree, we have items such as header files and
implementation .cpp files which have to conform to the public API
specified by various standards. Currently, the header files and
implementation files are prepared in a disjointed fashion: we write
them manually and separately. There is no check to verify that these
separate entities are coherent with each other or that they conform to
the standard. So, I propose fixing this using a ground truth file.
Other files are either generated from the information found in this
ground truth file, or checked against it. The different parts of this
proposal are as follows:

1. Ground truth file syntax: There will be a single ground truth file
for the entire public API surface of the libc. This file will be human
and machine editable. I propose that it be a Python file named
standard_api.py defining a dictionary in the following manner:

API = {
    <header file>: {
        <func name>: {
            "return_type": <return type>,
            "args": [<arg list>],
        },
        ...
    },
    ...
}
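
For illustration, a filled-in version of this dictionary might look like
the sketch below. The two entries, the spelling of the types as plain
strings, and the declaration() helper are all assumptions made for the
example; the proposal itself only fixes the overall shape.

```python
# standard_api.py (illustrative sketch; entries and type spellings are assumptions)
API = {
    "math.h": {
        "round": {
            "return_type": "double",
            "args": ["double"],
        },
    },
    "string.h": {
        "strlen": {
            "return_type": "size_t",
            "args": ["const char *"],
        },
    },
}


def declaration(header, func):
    """Render a C declaration for `func` from its entry in API (hypothetical helper)."""
    spec = API[header][func]
    # An empty argument list is spelled `void` in a C prototype.
    args = ", ".join(spec["args"]) or "void"
    return f'{spec["return_type"]} {func}({args});'


if __name__ == "__main__":
    print(declaration("math.h", "round"))
```

Keeping types as strings keeps the file easy to edit by hand, at the cost
of not validating them until something is generated from the file.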

2. Generating all public headers: The current code has a few examples
of generating header files. I propose that we generate all public
header files using the information found in standard_api.py. This
requires the addition of a new command to the header configuration
language. Specifically, I propose adding a command named %%func. The
usage of this command can be illustrated with the example of the round
function:

    // math.h.def
    ...
    %%func(round)
    ...

The header generator will replace the line on which the %%func command
is listed with the declaration of the round function as specified in
standard_api.py.
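
The replacement step could be sketched roughly as follows. This is not
the existing header generator: the expand_header() function, the regular
expression, and the inlined API excerpt are hypothetical and only show
how a %%func line could be resolved against the ground truth.

```python
import re

# Hypothetical excerpt of standard_api.py, inlined so the sketch is self-contained.
API = {"math.h": {"round": {"return_type": "double", "args": ["double"]}}}

# Matches a line that consists only of a %%func(<name>) command.
FUNC_COMMAND = re.compile(r"^\s*%%func\((\w+)\)\s*$")


def expand_header(header, def_text):
    """Replace each %%func(<name>) line in a .h.def text with a declaration."""
    funcs = API[header]
    out = []
    for line in def_text.splitlines():
        match = FUNC_COMMAND.match(line)
        if match is None:
            out.append(line)  # ordinary lines pass through unchanged
            continue
        name = match.group(1)
        if name not in funcs:
            # A function missing from the ground truth is an error, not a no-op.
            raise KeyError(f"{name} is not listed under {header} in the ground truth")
        spec = funcs[name]
        args = ", ".join(spec["args"]) or "void"
        out.append(f'{spec["return_type"]} {name}({args});')
    return "\n".join(out)


if __name__ == "__main__":
    print(expand_header("math.h", "// math.h.def\n    %%func(round)"))
```

Failing on an unknown name means a header cannot declare a function that
the ground truth does not describe.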

3. Entrypoint wrappers: The current way of adding a C symbol via a
post processing step might not be feasible on all targets. For
example: http://lists.llvm.org/pipermail/libc-dev/2019-October/000000.html.
For such targets, I propose that we generate a C wrapper for the C++
implementation as part of the add_entrypoint_obj rule. Availability of
the ground truth file facilitates the generation of these wrappers in
a straightforward manner. The wrappers also help in the use case of
LTO across app + libc. Moreover, since the wrappers are generated from
the ground truth, their successful compilation ensures that the
implementation conforms to the ground truth.
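
A wrapper generator along these lines might look like the sketch below.
The generate_wrapper() function, the libc_impl namespace name, and the
arg0, arg1, ... parameter names are assumptions for illustration; how the
add_entrypoint_obj rule would invoke such a step is not shown. Simple
"<type> <name>" parameters are assumed, so function-pointer and array
types would need extra handling.

```python
# Hypothetical excerpt of standard_api.py, inlined so the sketch is self-contained.
API = {"math.h": {"round": {"return_type": "double", "args": ["double"]}}}


def generate_wrapper(header, func, namespace="libc_impl"):
    """Emit an extern "C" wrapper that forwards to the C++ implementation.

    `namespace` is a placeholder for wherever the C++ implementation lives.
    """
    spec = API[header][func]
    arg_types = spec["args"]
    params = ", ".join(f"{t} arg{i}" for i, t in enumerate(arg_types)) or "void"
    call_args = ", ".join(f"arg{i}" for i in range(len(arg_types)))
    return (
        f'extern "C" {spec["return_type"]} {func}({params}) {{\n'
        f"  return {namespace}::{func}({call_args});\n"
        "}\n"
    )


if __name__ == "__main__":
    print(generate_wrapper("math.h", "round"))
```

Because the wrapper's signature comes from the ground truth, a mismatch
in the implementation that the compiler cannot convert implicitly would
show up as a compile error in the generated file.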

WDYT?

Thanks,
Siva Chandra

Recently, there was a discussion (http://lists.llvm.org/pipermail/llvm-dev/2019-July/133648.html) about a build dependency on Python, and the conclusion of that thread was that it is something we want to avoid.

Since the standard_api.py script is used for generating code (headers and source), could we possibly use TableGen for that purpose (and describe the API surface in a .td file)?

Yes. That is certainly possible, and may be a better thing to do.