New llvm.commandline named metadata

Hi llvm-dev,

I have an implementation of -frecord-gcc-switches ready for Clang, and a named metadata node seemed like the correct way to approach this on the LLVM side. I have a review at which discusses some of the differences in implementation vs. GCC. A change to the set of "special" named metadata nodes seems like something that warrants an llvm-dev post, and I was not sure who specifically would be interested in reviewing the changes.


Overall, I think we should do this. It’s one of the most popular out of tree extensions that people make to LLVM and clang, and we want it for codeview anyway. We currently don’t emit a command line there. Other than that, I have some questions about how to do it.

How will you associate the command line with the compilation unit to deal with regular (fat, not thin) LTO? It looks like you don’t:

assert(N->getNumOperands() == 1 &&
       "llvm.commandline metadata entry can have only one operand");

IR linking will concatenate the named metadata nodes into a list of all the command lines. What should the semantics be? During regular object linking, I assume the command lines are discarded. Maybe just code that up?
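For illustration, the concatenation behavior in question can be modeled in a few lines of plain C++. This is only a sketch and does not use the LLVM API: `Entry`, `NamedNode`, `linkInto`, and `allEntriesHaveOneElement` are hypothetical names, and the example command lines are made up. It assumes a named metadata node is an ordered list of entries and that linking appends the source module's entries to the destination's.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for one entry of a named metadata node:
// a tuple of strings, e.g. a single recorded command line.
using Entry = std::vector<std::string>;

// Hypothetical stand-in for a named metadata node such as llvm.commandline.
struct NamedNode {
  std::vector<Entry> operands;
};

// Models the concatenation: the destination keeps its own entries and
// gains every entry of the source, in order. dst and src must be
// distinct objects.
void linkInto(NamedNode &dst, const NamedNode &src) {
  dst.operands.insert(dst.operands.end(), src.operands.begin(),
                      src.operands.end());
}

// Expresses the quoted assert as a predicate over the whole list:
// every entry must hold exactly one element.
bool allEntriesHaveOneElement(const NamedNode &node) {
  for (const Entry &entry : node.operands)
    if (entry.size() != 1)
      return false;
  return true;
}
```

Under this model, linking two modules that each recorded one command line yields a node with two single-element entries, so the per-entry check still holds after the merge.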

> Overall, I think we should do this. It's one of the most popular out of tree extensions that people make to LLVM and clang, and we want it for codeview anyway.

Why isn't OPT_dwarf_debug_flags good enough for codeview?

-- adrian

Wow, I don’t know how I missed that. I touched the codeview parts of this a few weeks ago in r344393, and I looked at DICompileUnit, but the field is called “Flags” and it didn’t occur to me that those would be compiler flags. Today I learned. I knew we wanted this, I just didn’t know it was implemented.

With that in mind, if you want to embed the command line into the object file separately from the debug info, I guess it’s reasonable to put it in the metadata in two places: once in the DICompileUnit and again in the named metadata. It helps keep separate concerns separate.

Hi Reid,

I might be misunderstanding your original question, but in the IR llvm.commandline behaves the same as llvm.ident. The IR linker handles merging multiple named metadata (see test/Linker/commandline.ll), and in the code-object multiple command-lines are emitted separated by null bytes (see test/MC/ELF/commandline.s). The object linker can merge these sections and retain all of the individual command-lines just fine. The assert you mention just requires that each individual command-line consist of exactly one element.

Giving any meaning to the command-lines relative to other compilation units is up to whoever is interpreting them; they are just a set of strings.
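As a rough sketch of what "just a set of strings" means for a consumer, the following C++ splits the raw bytes of such a section back into individual command lines. It assumes only what is stated above, namely that entries are separated by null bytes; the function name `splitCommandLines` is hypothetical, and skipping empty pieces is a choice made here so that leading, trailing, or doubled null bytes are tolerated.

```cpp
#include <string>
#include <vector>

// Splits raw section contents into individual command lines.
// Entries are assumed to be separated by NUL bytes; empty pieces
// (from leading, trailing, or repeated NULs) are skipped.
std::vector<std::string> splitCommandLines(const std::string &section) {
  std::vector<std::string> result;
  std::string current;
  for (char c : section) {
    if (c == '\0') {
      if (!current.empty())
        result.push_back(current);
      current.clear();
    } else {
      current += c;
    }
  }
  // Keep a final entry that is not NUL-terminated.
  if (!current.empty())
    result.push_back(current);
  return result;
}
```

Because merging by the object linker amounts to concatenating section contents, running this over the merged bytes returns every compilation unit's command line in order, with no association to any particular unit.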