compilation db question

Hi all,

I wrote a tool (Bear) which generates a compilation database. A user feeds the output into a tool which uses the Clang tooling libraries: <https://github.com/woboq/woboq_codebrowser/>. He runs this tool against a project which contains .s and .asm files. (These files are compiled as clang -x assembler-with-cpp.)

When the .s and .asm files are not in the compilation database, the codebrowser output is not complete. What I’m trying to figure out is: is a compilation database which has .s or .asm files in it legitimate? Is it the responsibility of the tooling library to ignore/consider those entries even if they are not real compilations? And if that’s okay, why are linking and preprocessing not part of the compilation database too?

Any help or a pointer to some documentation is appreciated.

Thanks,

Laszlo

The compilation database doesn’t really say this, and I kept that unspecified precisely so we could extend it later :wink:
Preprocessing is not an extra step (at least in C++ afaiu it’s very tightly ingrained in the language), so I don’t think pulling it out makes sense.
Linking might make sense to put in there, if somebody has a use case.

I’d generally say that tool infrastructure should be able to figure out what the tool is interested in, so I’d file bugs / send patches where that doesn’t work out of the box.
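
To make the idea of a tool picking the entries it is interested in a bit more concrete, here is a minimal sketch in Python. It is not how the Clang tooling libraries work; the function name `entries_for_tool`, the extension list and the sample database are all made up for illustration.

```python
import json
import os

# Assumption: extensions a C/C++ source tool might want to skip.
# This is an illustrative list, not anything defined by the format.
ASSEMBLER_EXTENSIONS = {".s", ".S", ".asm"}


def entries_for_tool(entries, skip_extensions=ASSEMBLER_EXTENSIONS):
    """Return only the entries whose "file" extension is not skipped."""
    kept = []
    for entry in entries:
        _, extension = os.path.splitext(entry["file"])
        if extension not in skip_extensions:
            kept.append(entry)
    return kept


# Hypothetical database content, using the "command" form of an entry.
database = json.loads("""
[
  {"directory": "/src", "file": "main.c", "command": "clang -c main.c"},
  {"directory": "/src", "file": "boot.s",
   "command": "clang -x assembler-with-cpp -c boot.s"}
]
""")

for entry in entries_for_tool(database):
    print(entry["file"])
```

With this approach the generator can record everything it saw, and each consumer decides for itself which entries are relevant.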

Cheers,
/Manuel

thanks Manuel,

Interesting what you say. Can you comment on some other issues I learned about from users? I would like to build a better understanding of what the responsibilities of compilation database generators and of the tooling infrastructure are.

  • Compilation flags like -MD, -MT x, -MF x, etc. might cause “duplicate” items in the compilation database (the only difference being the presence of these flags). I modified the command line to filter out these flags. But if I’m following your logic, the tooling libs should notice that there is no semantic difference between the two compilations and run only once (which is not the case now). Would you be happy with that?

  • Compilations where a single command compiles (and links) several files together are split into multiple entries. clang one.c two.c will produce ‘[{"file": "one.c", "command": "clang -c one.c"}, {"file": "two.c", "command": "clang -c two.c"}]’. Is such a transformation acceptable from your point of view?

  • About linking (especially static linking): how would you record that in the existing format?

  • The Bear tool currently filters out linker flags, because my understanding was that the output will be used by tools which do not go beyond compilation. Is that a wrong assumption too?

regards,

Laszlo

  • Compilation flags like -MD, -MT x, -MF x, etc. might cause “duplicate” items in the compilation database (the only difference being the presence of these flags). I modified the command line to filter out these flags. But if I’m following your logic, the tooling libs should notice that there is no semantic difference between the two compilations and run only once (which is not the case now). Would you be happy with that?

If there’s no semantic difference, I’d argue that there should only be one entry in the compilation db. On the other hand, I’d not object to patches making the tooling infra deduplicate :slight_smile:
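
A rough sketch of what such deduplication could look like, written in Python for brevity. This is neither Bear's code nor the tooling infra's; the helper names are hypothetical, the flag list is an assumption, and only the space-separated form (`-MF x`, not `-MFx`) is handled.

```python
import shlex

# Assumption: dependency-generation flags that do not change what is compiled.
# -M and -MM are left alone on purpose, since they turn the command into a
# preprocess-only run.
FLAGS_WITHOUT_VALUE = {"-MD", "-MMD", "-MP"}
FLAGS_WITH_VALUE = {"-MF", "-MT", "-MQ"}


def strip_dependency_flags(arguments):
    """Drop dependency-file flags (and their values) from an argument list."""
    result = []
    skip_next = False
    for argument in arguments:
        if skip_next:
            skip_next = False  # this argument is the value of -MF/-MT/-MQ
            continue
        if argument in FLAGS_WITHOUT_VALUE:
            continue
        if argument in FLAGS_WITH_VALUE:
            skip_next = True
            continue
        result.append(argument)
    return result


def deduplicate(entries):
    """Keep one entry per (directory, file, stripped command), in order."""
    seen = set()
    unique = []
    for entry in entries:
        arguments = strip_dependency_flags(shlex.split(entry["command"]))
        key = (entry["directory"], entry["file"], tuple(arguments))
        if key not in seen:
            seen.add(key)
            unique.append({
                "directory": entry["directory"],
                "file": entry["file"],
                "command": shlex.join(arguments),
            })
    return unique


# Hypothetical input: the same compilation recorded twice.
entries = [
    {"directory": "/src", "file": "one.c",
     "command": "clang -c one.c -MD -MF one.d -MT one.o"},
    {"directory": "/src", "file": "one.c", "command": "clang -c one.c"},
]
print(deduplicate(entries))
```

The same function works on either side: a generator can call it before writing the database, or a consumer can call it after loading one.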

  • Compilations where a single command compiles (and links) several files together are split into multiple entries. clang one.c two.c will produce ‘[{"file": "one.c", "command": "clang -c one.c"}, {"file": "two.c", "command": "clang -c two.c"}]’. Is such a transformation acceptable from your point of view?

You mean the db will contain both? Sure.

  • About linking (especially static linking): how would you record that in the existing format?

Haven’t thought about it in detail yet, but with the recent addition of output files, it seems straightforward to record the linker command?

  • The Bear tool currently filters out linker flags, because my understanding was that the output will be used by tools which do not go beyond compilation. Is that a wrong assumption too?

Well, currently it’s the right assumption :slight_smile: Together we’re free to change it :slight_smile:

- Compilation flags like -MD, -MT x, -MF x, etc. might cause "duplicate"
items in the compilation database (the only difference being the
presence of these flags). I modified the command line to filter out these
flags. But if I'm following your logic, the tooling libs should notice that
there is no semantic difference between the two compilations and run only
once (which is not the case now). Would you be happy with that?

I'm filtering those out in D27140 ("Allow clang to write compilation
database records") for the self-write logic. Background is that I do not
want tools using the compilation database to accidentally change anything,
i.e. rerunning the tool command should only modify the output file and
nothing else.

- Compilations where a single command compiles (and links) several files
together are split into multiple entries. `clang one.c two.c` will produce
'[{"file": "one.c", "command": "clang -c one.c"}, {"file": "two.c",
"command": "clang -c two.c"}]'. Is such a transformation acceptable from
your point of view?

Again, that's what the above is implementing. It seems to be the most
useful behavior.
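
For readers who want to see the splitting spelled out, here is a small Python sketch of the transformation described in the question. It is not the code from the patch mentioned above nor from Bear; `split_compilation` and the extension list are hypothetical, and the sketch would misread a flag value that happens to look like a source file (e.g. `-include foo.c`).

```python
import os
import shlex

# Assumption: what counts as a source file for this sketch.
SOURCE_EXTENSIONS = {".c", ".cc", ".cpp", ".cxx"}


def split_compilation(directory, command):
    """Turn one multi-file command into one "-c" entry per source file."""
    arguments = shlex.split(command)
    compiler, rest = arguments[0], arguments[1:]
    sources = [a for a in rest
               if os.path.splitext(a)[1] in SOURCE_EXTENSIONS]

    # Keep the shared flags; drop the sources, "-c", and "-o <file>",
    # since the combined output name no longer applies per file.
    flags = []
    skip_next = False
    for argument in rest:
        if skip_next:
            skip_next = False
            continue
        if argument == "-o":
            skip_next = True
            continue
        if argument == "-c" or argument in sources:
            continue
        flags.append(argument)

    return [
        {"directory": directory,
         "file": source,
         "command": shlex.join([compiler, *flags, "-c", source])}
        for source in sources
    ]


for entry in split_compilation("/src", "clang one.c two.c"):
    print(entry["file"], "->", entry["command"])
```

Each resulting entry can be re-run on its own, which is what most consumers of the database expect.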

- About linking (especially static linking): how would you record that in
the existing format?

I explicitly ignore linking right now as the consumers I tried so far
have no way to deal with it anyway.

- The Bear tool currently filters out linker flags, because my
understanding was that the output will be used by tools which do not go
beyond compilation. Is that a wrong assumption too?

I don't filter out any other flags, just because that would be more work
to do.

Joerg