Extending CMAKE_EXPORT_COMPILE_COMMANDS

Hi List,

Crossposting from cmake.devel. I found out that a particular feature in
CMake was contributed by the clang community. I'm working on a small tool
which requires the same information plus some more. In its current form,
the feature only stores the compile commands and not the link commands.
Is this something the clang community would be interested in too (for
tooling at the link level, if such is planned)?

The question is whether the link commands should be stored in the same file,
resulting in a "build database" instead of a compile database, or in a
separate file. Any input on this topic is welcome.

Cheers,

Bertjan

== Original message ==

I recently found out about the CMAKE_EXPORT_COMPILE_COMMANDS option while
getting myself familiar with clang. In a tool [1] I'm working on (or rather,
starting to work on again) I need something similar.

The current CMAKE_EXPORT_COMPILE_COMMANDS option only exports compile
commands (as its name suggests). However, in my tool I build and visualize
the full graph (i.e. sources, headers, object files, libraries). For this I
would also need the link commands. Currently I have a wrapper around the C++
compiler, which works fine, but after having seen this option I realized
that it would be far more convenient. Is it possible to get a similar
option for exporting link commands, or perhaps both:

CMAKE_EXPORT_LINK_COMMANDS
CMAKE_EXPORT_COMPILE_AND_LINK_COMMANDS

I'm willing to work on a patch, given that a) the option has a chance of
being included, and b) I get some pointers on where to start to make it happen.

Cheers,

Bertjan

[1] https://gitorious.org/cpp-dependency-analyzer/cpp-dependency-analyzer

The reason why you might want a slightly different file format is the key.
For the compilation database, the usual question is “how do I parse this TU?”. What would you key on for link commands? The output file?

Cheers,
/Manuel

Hi Manuel,

Even in your usual case (“how do I parse this TU”), the input file is not a discriminating key: this is why we get CompileCommands from the CompilationDatabase. At some point, you need to choose one CompileCommand among several. I believe the real discriminating key is the output file (with the build systems I know). Besides, this would enable other usages for the CompilationDatabase. It seems to me it is good to have the input file as a key, but this is not enough.
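
To make the point about keys concrete, here is a minimal Python sketch. It assumes GCC-style command lines in which the output is named with `-o`; the helper name `output_of` and the sample entries are hypothetical and not taken from any existing tool.

```python
import shlex


def output_of(command):
    """Return the argument of -o in a GCC-style command line, or None.

    Heuristic only: handles "-o FILE" and the joined form "-oFILE".
    """
    args = shlex.split(command)
    for i, arg in enumerate(args):
        if arg == "-o" and i + 1 < len(args):
            return args[i + 1]
        if arg.startswith("-o") and len(arg) > 2:
            return arg[2:]
    return None


# Hypothetical entries: the same source file compiled twice with different flags.
entries = [
    {"directory": "/build",
     "command": "c++ -c /src/a.cpp -o a.o",
     "file": "/src/a.cpp"},
    {"directory": "/build",
     "command": "c++ -DDEBUG -c /src/a.cpp -o a_dbg.o",
     "file": "/src/a.cpp"},
]

# Keyed by input file, both entries share one key.
by_file = {}
for entry in entries:
    by_file.setdefault(entry["file"], []).append(entry)

# Keyed by output file, every entry gets its own key.
by_output = {output_of(entry["command"]): entry for entry in entries}

print(len(by_file["/src/a.cpp"]), len(by_output))
```

With this data, a lookup by input file returns two candidate commands, while the output file identifies each command uniquely.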

Well, when you want to analyze a file, you usually need the set of parameters it can be parsed with, thus the current design of the file. For linking, this doesn’t seem to be necessary…

I especially wouldn’t want to change the key in the current file format due to adding link commands. I’m currently slightly leaning towards an extra file, because that leads to tools that only care about one part being simpler (and once you add one use case, you start adding other use cases ;), but it’s not a clear-cut decision.

Cheers,
/Manuel

Manuel Klimek wrote:

> I especially wouldn't want to change the key in the current [compilation
> database] file format due to adding link commands. I'm currently slightly
> leaning towards an extra file, because that leads to tools that only care about
> one part being simpler (and once you add one use case, you start adding other
> use cases ;), but it's not a clear-cut decision.

Correct me if I'm wrong, but the current file doesn't have an explicit
key. It looks like:

    [
      { "directory": ...,
        "command": ...,
        "file": ... },
      ...
    ]

Not like:

    { "<file>": [{
        "directory": ...,
        "command": ... },
        ...
      ],
      ...
    }

So users of this compilation database (currently) would have to build
their own index anyway, if they decide they need it for performance
reasons.
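
A minimal sketch of such a user-built index, in Python. The sample entries are made up for illustration and `build_index` is a hypothetical helper name; the only thing taken from the format is the array of objects with "directory", "command" and "file".

```python
import json
from collections import defaultdict


def build_index(entries):
    """Group compilation database entries by their "file" value."""
    index = defaultdict(list)
    for entry in entries:
        index[entry["file"]].append(entry)
    return dict(index)


# Hypothetical sample data in the array-of-objects layout shown above.
SAMPLE = json.loads("""
[
  {"directory": "/build", "command": "c++ -c /src/a.cpp -o a.o",
   "file": "/src/a.cpp"},
  {"directory": "/build", "command": "c++ -DDEBUG -c /src/a.cpp -o a_dbg.o",
   "file": "/src/a.cpp"},
  {"directory": "/build", "command": "c++ -c /src/b.cpp -o b.o",
   "file": "/src/b.cpp"}
]
""")

index = build_index(SAMPLE)
for path, commands in sorted(index.items()):
    print(path, len(commands))
```

Because one file can map to several commands, the index stores a list per file rather than a single entry.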

Given the above, I don't think it would hurt to add an extra "output"
element. And while we're at it --before there are too many users--
rename "file" to "input" or something like that. Generate both "file"
and "input" entries for a while, to ease the transition, but mark
"file" as deprecated.

Tools for working on C/C++ source files would, I imagine, independently
obtain a list of source files (e.g. "find . -name '*.cpp'") and *then*
look up each file in the compilation database. So it doesn't matter if
the compilation database contains entries for non-source files (like a
link command with input file "something.o"). Please correct me if your
use case differs.

Similarly, my vote would be for something like
CMAKE_EXPORT_BUILD_COMMANDS instead of separate
*_COMPILE_COMMANDS and *_LINK_COMMANDS.

> I'm willing to work on a patch, given that [...] I get some pointers on
> where to start to make it happen.

The documentation for this compilation database is at
http://clang.llvm.org/docs/JSONCompilationDatabase.html

The feature was added to cmake in these 2 commits:
http://cmake.org/gitweb?p=cmake.git;a=commitdiff;h=fe07b055
http://cmake.org/gitweb?p=cmake.git;a=commitdiff;h=5674844d

Finally, I must point out that I am a very recent member of the clang
community so take my input with a grain of salt.

--Dave.

> Given the above, I don't think it would hurt to add an extra "output"
> element. And while we're at it --before there are too many users--
> rename "file" to "input" or something like that. Generate both "file"
> and "input" entries for a while, to ease the transition, but mark
> "file" as deprecated.

Well, the idea is that "file" specifies the main source file for a TU.

> Tools for working on C/C++ source files would, I imagine, independently
> obtain a list of source files (e.g. "find . -name '*.cpp'") and *then*
> look up each file in the compilation database. So it doesn't matter if
> the compilation database contains entries for non-source files (like a
> link command with input file "something.o"). Please correct me if your
> use case differs.

Ah, yes, theoretically it doesn’t matter, as long as the tools all understand it and can throw away things they don’t need.

You raise good points. As I said, which way to choose is not clear-cut, and I'd be willing to say whoever implements it wins ;)

Cheers,
/Manuel

Manuel Klimek wrote:

> Ah, yes, theoretically it doesn't matter, as long as the tools all
> understand it and can throw away things they don't need.

And for the linker part: I'd like to parse the command line as well, to find
the libraries against which an executable/library is linked, as well as other
settings passed to the linker, such as search directories and rpath.
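
As a rough illustration of what could be recovered from a stored link command, the following Python sketch pulls out libraries, search directories and rpaths. It assumes a GCC-style link line with joined `-lNAME` and `-LDIR` arguments and rpaths passed through `-Wl,`; `link_settings` is a hypothetical helper, and other spellings (for example a separate `-l NAME`) are not handled.

```python
import shlex


def link_settings(command):
    """Collect libraries, search directories and rpaths from a link line."""
    libraries, search_dirs, rpaths = [], [], []
    for arg in shlex.split(command):
        if arg.startswith("-l") and len(arg) > 2:
            libraries.append(arg[2:])
        elif arg.startswith("-L") and len(arg) > 2:
            search_dirs.append(arg[2:])
        elif arg.startswith("-Wl,"):
            # Options forwarded to the linker are comma-separated.
            parts = arg.split(",")[1:]
            for j, part in enumerate(parts):
                if part == "-rpath" and j + 1 < len(parts):
                    rpaths.append(parts[j + 1])
                elif part.startswith("-rpath="):
                    rpaths.append(part[len("-rpath="):])
    return {"libraries": libraries,
            "search_dirs": search_dirs,
            "rpaths": rpaths}


# Hypothetical link command, as it might appear in a "command" field.
settings = link_settings(
    "c++ -o app main.o -L/opt/lib -lfoo -lbar -Wl,-rpath,/opt/lib"
)
print(settings)
```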

> Similarly, my vote would be for something like
> CMAKE_EXPORT_BUILD_COMMANDS instead of separate
> *_COMPILE_COMMANDS and *_LINK_COMMANDS.

I'm in favor of this as well, though for me it's just a spare-time project, so
if others have scalability reasons that are important to them, I can live
with a separate file as well. I don't see how this would be formatted, though,
to stay similar to the current file format. I suppose something like:

// For compilation
{ "directory": ...,
   "command": ...,
   "input": <sourcefile>,
   "output": <sourcefile>.o
}

// For linking
{ "directory": ...,
   "command": ...,
   "input": [<list of files here>],
   "output": <lib_or_exec_name>
}

Possibly one could add an extra level of properties to distinguish between
the two types:

{
  "compilations": [
    { <compilation_objects> }
  ],
  "linkcommands": [
    { <link_objects> }
  ]
}

To me that is a bit cleaner, but it likely requires more work on existing
tools that already use the format. Instead, an extra property can be set on
each object:

{ "directory": ...,
   "command": ...,
   "input": [<list of files here>],
   "output": <lib_or_exec_name>,
   "type": <link|compilation>
}
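
A consumer of such a combined file could then split the entries on the proposed "type" property. The Python sketch below assumes the hypothetical "input", "output" and "type" fields from the proposal above (none of which exist in the current format). As a further assumption, it treats entries without a "type" as compilations, so that a file containing only compile commands would still load.

```python
def split_by_type(entries):
    """Separate entries into (compilations, links) using the "type" field."""
    compilations, links = [], []
    for entry in entries:
        # Assumption: a missing "type" means a plain compile command.
        if entry.get("type", "compilation") == "link":
            links.append(entry)
        else:
            compilations.append(entry)
    return compilations, links


# Hypothetical entries following the proposed layout.
entries = [
    {"directory": "/build", "command": "c++ -c /src/a.cpp -o a.o",
     "input": "/src/a.cpp", "output": "a.o", "type": "compilation"},
    {"directory": "/build", "command": "c++ -c /src/b.cpp -o b.o",
     "input": "/src/b.cpp", "output": "b.o"},
    {"directory": "/build", "command": "c++ -o app a.o b.o",
     "input": ["a.o", "b.o"], "output": "app", "type": "link"},
]

compilations, links = split_by_type(entries)
print(len(compilations), len(links))
```

A tool that only cares about parsing source files can keep the first list and ignore the second.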
  

> You raise good points - as I said, which way to choose is not clear-cut,
> and I'd be willing to say whoever implements it wins ;)

Hehe, so I implement it however I want for my spare-time project and burden
people who use it in their daily work with some over-engineered format =:).

No, really, I just brought it up because this CMake extension looked really
useful to me, and if it exported the additional information, it would make
life a bit easier for me. I'm willing to do the implementation; I just don't
want to implement something that gets rejected because others turn out to
see it differently once they see the implementation.

So I prefer to settle on a solution first, then I'll sit down and add the
feature.

Cheers,

Bertjan

p.s. My reaction time illustrates what spare time means in my
context =:)

Looping in chandlerc for an opinion :)