Embedding Compiler Invocation in Object Files

I wanted to revive the idea of adding a feature that would have clang embed its invocation in the object file it creates (like a compilation database, but internal to the object file). I know this was discussed on the mailing list a few years ago, but as far as I’m aware, no progress was made. My use case is debugging (rather than reconstructing a clang AST from debug info, just re-parse the correct translation unit and get the complete AST), but I know other people are interested in this functionality as well. For my use case, I think I’d be happy associating this with the DWARF compilation unit (how to represent it is open for discussion; putting it into DW_AT_producer is probably simplest, but there may be other options). Thoughts?

I would not be keen on adding this sort of stuff to DW_AT_producer.

There’s a DW_AT_description attribute that might be appropriate for this purpose.

–paulr

Most obvious question first, which invocation should that be? The
original clang command? The -cc1 invocation?

Joerg

If you wish to embed it into the DWARF data, then there is a well tested
vendor extension that Apple has in LLVM: DW_AT_APPLE_flags.

That said, I was already looking at adding similar functionality for
compatibility with GCC (-frecord-gcc-switches). It works by embedding a
special section in the object file that holds the command line as an array
of C strings. The recorded invocation is the *compiler* invocation, not the
driver invocation.
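
To make the "array of C strings" layout concrete, here is a minimal sketch of how a tool could decode such a section once its raw bytes have been extracted from the object file. It assumes the payload is simply a run of NUL-terminated strings; the function name and the sample bytes are made up for illustration, and extracting the section itself (GCC documents the ELF section name as .GCC.command.line, which `readelf -p` can dump) is left out.

```python
def parse_recorded_switches(section: bytes) -> list[str]:
    """Split a section payload of NUL-terminated C strings into arguments.

    Assumption: the payload is nothing but NUL-terminated strings, possibly
    followed by NUL padding. Empty entries are dropped, so a genuinely empty
    argument would be lost by this sketch.
    """
    return [
        part.decode("utf-8", errors="replace")
        for part in section.split(b"\0")
        if part
    ]


# Hypothetical payload, not taken from a real object file.
sample = b"-O2\0-g\0-fPIC\0"
print(parse_recorded_switches(sample))  # ['-O2', '-g', '-fPIC']
```

Because each argument is stored as its own string, arguments containing spaces survive intact, which a single space-joined string cannot guarantee.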

> I wanted to revive the idea of adding a feature that would have clang embed
> its invocation in the created object file (like a compilation database, but
> internal to the object file).
>
> Most obvious question first, which invocation should that be? The
> original clang command? The -cc1 invocation?

There isn't necessarily a 1-to-1 mapping so this is tricky. For Keno's use
case cc1 is probably fine, but for more general build reconstruction /
tooling stuff the driver is preferable based on my experience.

We may not want to conflate the two use cases (since I suspect the CC1 case
is much more straightforward).

-- Sean Silva

Neat; I didn't know about this GCC option. Apparently there is also
a -grecord-gcc-switches which uses DW_AT_producer:

"""
This switch causes the command-line options used to invoke the compiler
that may affect code generation to be appended to the DW_AT_producer
attribute in DWARF debugging information. The options are concatenated with
spaces separating them from each other and from the compiler version. It is
enabled by default. See also -frecord-gcc-switches for another way of
storing compiler options into the object file.
""" - https://gcc.gnu.org/onlinedocs/gcc/Debugging-Options.html
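
As a rough illustration of what a consumer of that DW_AT_producer string has to do, the sketch below separates the compiler identification from the appended options. It is only a heuristic under a stated assumption: the options begin at the first whitespace-separated token that starts with "-". The function name and the sample string are hypothetical.

```python
def split_producer(producer: str) -> tuple[str, list[str]]:
    """Split a DW_AT_producer string into (compiler id, option tokens).

    Heuristic: everything before the first token starting with '-' is
    treated as the compiler name and version. Options whose values contain
    spaces cannot be reassembled, because the string is space-joined.
    """
    tokens = producer.split()
    for index, token in enumerate(tokens):
        if token.startswith("-"):
            return " ".join(tokens[:index]), tokens[index:]
    # No options were recorded; the whole string identifies the compiler.
    return " ".join(tokens), []


# Illustrative value, not copied from a real object file.
sample = "GNU C11 7.5.0 -mtune=generic -march=x86-64 -g -O2"
print(split_producer(sample))
```

The ambiguity around spaces is one practical difference between this representation and a section that stores each argument as a separate C string.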

-- Sean Silva

Yeah, -grecord-gcc-switches is slightly different in that it records only a
subset of them (the ones that affect code generation).

There is an alternative that comes partially with the -fembed-bitcode=all option: the compiler invocation (the -cc1 command line) is embedded in a special section in the object file. This is a side effect of embedding full bitcode, but we can clean up the code to embed only the command line, without the bitcode.

Steven