Compilation Database format

Hello Clang community,

I am currently working on a tool that generates a compilation database
by intercepting every exec call, similar to
https://github.com/rizsotto/Bear, but with the end goal of offering
users more than just this feature (I want to use libclang to provide
autocompletion, refactoring, statistics and other features for the
user's codebase, independently of any editor or IDE).

I really like the idea of the compilation database, and I have a few
questions that are not answered by the manual page:
http://clang.llvm.org/docs/JSONCompilationDatabase.html

Here are my questions:
1) Is it possible to provide a cwd relative to the compilation database
file itself? My aim is to let users move their source folder without
having to use sed or regenerate the compilation database. For example,
if the compilation database is "/my/project/compilation_database.json"
and my source is "/my/project/src/main.cpp", I want to give the cwd as
"src/" and the source file as "main.cpp". Is this correct and, above
all, is it supported by libclang?
2) Is it possible to extend the compilation database without breaking
libclang? I would like to add a list of fields named "dependency" to
specify which files the source depends on. This can be used to compute
statistics, or simply to know whether a file (or any of its
dependencies) has changed without having to preprocess, which would be
lightning-fast on big projects.
For example:
   - cwd "src/"
   - file "main.cpp"
   - command ...
   - dependency "/usr/include/stdio.h"
   - dependency "/usr/include/stdlib.h"

If these features are not possible with the current libclang, and if you
agree, I can work on a patch to make them possible.

Thank you, and thank you again for providing clang/llvm to people.
Pierrick

Hi Pierrick,

> Those are my questions:
> 1) Is that possible to provide a cwd relative to compilation database
> file itself? What I aim is possibility for user to move his source
> folder from place without having to use sed or regenerate the
> compilation database. For example, if compilation database is
> "/my/project/compilation_database.json" and my source is in
> "/my/project/src/main.cpp" I want to offer cwd as "src/" and source file
> as "main.cpp". Is this correct and more than all, is that supported by
> libclang?

I don’t think that’s currently supported, but patches are welcome :)

> 2) Is that possible to extend compilation_database without breaking
> libclang? I would like to add a list of fields named "dependency" to
> precise which files the source depends on. This can be used for creating
> statistics or simply to know if file (or any dep) has changed without
> having to preprocess, thus enabling speedlight on big projects.
> For example:
>    - cwd "src/"
>    - file "main.cpp"
>    - command ...
>    - dependency "/usr/include/stdio.h"
>    - dependency "/usr/include/stdlib.h"
>
> If those feature are not possible with current libclang, and if you
> agree, I can work on a patch to make this possible.

Two-part answer:
a) Regarding local extensions to the compilation database: I’d suggest not doing them; the probability that we hit incompatibilities seems high enough, and it seems easy to just put another JSON file next to the compilation database holding all your extra data.
b) Regarding dependencies: I’m not sure the compilation database is the right venue for that; that’s why we have build systems. If you want dependencies, you’ll immediately need information on how to build those dependencies, and suddenly you have a new build system.
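
For (a), a minimal sketch of the side-file idea in Python. The file name `compile_extras.json`, its layout (one JSON object keyed by each entry's "file" value), and the helper name `load_with_extras` are assumptions made up for illustration; none of them is something clang defines or reads.

```python
import json
from pathlib import Path


def load_with_extras(db_path, extras_name="compile_extras.json"):
    """Pair each compilation database entry with its extra data.

    The side file (hypothetical name and layout) lives next to the
    database and maps an entry's "file" value to a dict of extra fields.
    """
    db_path = Path(db_path)
    entries = json.loads(db_path.read_text())

    extras_path = db_path.with_name(extras_name)
    extras = json.loads(extras_path.read_text()) if extras_path.exists() else {}

    # Entries without extra data get an empty dict, so the database
    # itself stays untouched and usable by any standard tool.
    return [(entry, extras.get(entry["file"], {})) for entry in entries]
```

Keying on "file" alone is a simplification: if the same source is compiled several times with different flags, the key would need to include more than the file name.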

Cheers,
/Manuel

Thank you for your quick answer.

I have a few clarifications (inline).

> Hi Pierrick,
>
> Those are my questions:
> 1) Is that possible to provide a cwd relative to compilation
> database file itself? What I aim is possibility for user to move
> his source folder from place without having to use sed or
> regenerate the compilation database. For example, if compilation
> database is "/my/project/compilation_database.json" and my source
> is in "/my/project/src/main.cpp" I want to offer cwd as "src/" and
> source file as "main.cpp". Is this correct and more than all, is
> that supported by libclang?
>
> I don't think that's currently supported, but patches are welcome :)

When I get to this part of my project, I'll work around this.
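
One possible workaround, sketched in Python under a few assumptions: the tool reads the JSON itself instead of handing it to libclang, and the cwd is stored under the "directory" key that the JSON format documents. The helper name `load_relocatable` is made up for illustration.

```python
import json
from pathlib import Path


def load_relocatable(db_path):
    """Load a compilation database whose "directory" values may be
    relative to the database file, and return entries with absolute
    "directory" and "file" paths."""
    db_path = Path(db_path).resolve()
    base = db_path.parent

    resolved = []
    for entry in json.loads(db_path.read_text()):
        # Joining with an absolute path yields that absolute path, so
        # ordinary (absolute) databases pass through unchanged.
        directory = (base / entry["directory"]).resolve()
        source = (directory / entry["file"]).resolve()
        resolved.append({**entry, "directory": str(directory), "file": str(source)})
    return resolved
```

With the database at "/my/project/compilation_database.json", a directory of "src/" and a file of "main.cpp", this would yield "/my/project/src/main.cpp", and moving the whole project folder would not require regenerating anything.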

>
> 2) Is that possible to extend compilation_database without
> breaking libclang? I would like to add a list of fields named
> "dependency" to precise which files the source depends on. This
> can be used for creating statistics or simply to know if file (or
> any dep) has changed without having to preprocess, thus enabling
> speedlight on big projects.
> For example:
> - cwd "src/"
> - file "main.cpp"
> - command ...
> - dependency "/usr/include/stdio.h"
> - dependency "/usr/include/stdlib.h"
>
> If those feature are not possible with current libclang, and if
> you agree, I can work on a patch to make this possible.
>
> Two part answer: a) regarding local extensions to the command line
> database: I'd suggest not to do them; the probability we hit
> incompatibilities seem high enough, and it seems easy to just put
> another json file next to the compilation database having all your
> extra data

I don't really see the risk here. I don't want libclang to understand
the extra information I put in the JSON file, just to ignore what it
does not know. That would let people who want to use this format put
additional information in the same file. Using two different JSON files
seems complicated to me; the aim is to have a single file with the whole
content. I could still hack the file, put my extension under a comment,
and parse it anyway. I'll see what to do when I get to the libclang
interaction. I suppose it will be possible to discuss this around a
patch. I also plan to use this format for languages other than C/C++, so
extensions carrying information beyond the command line would be
welcome.

> b) regarding dependencies: I'm not sure the compilation database is
> the right venue for that; that's why we have build systems; if you
> want dependencies, you'll immediately need information on how to build
> those dependencies, and suddenly you have a new build system
>

I get your point, but I don't want to create another build system. I
only want to express the dependencies of one compilation unit. I am not
even talking about linked executables, just the object files. I still
think this information could be used by tools like ccache or my project
to know whether a file needs recompilation without preprocessing (which
is faster than compiling, but not as fast as stat-ing XX files; that
could be a big difference for huge projects like Linux).
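
To make the idea concrete, here is a rough Python sketch of the stat-only check I have in mind. It assumes the dependency list was recorded earlier (for example in the "dependency" fields above) and compares modification times only; the function name `needs_rebuild` is made up, and this is not how ccache actually decides.

```python
import os


def needs_rebuild(output, dependencies):
    """Return True if the object file is missing or older than any of
    its recorded dependencies. Only stat() is used; nothing is
    preprocessed or read."""
    try:
        out_mtime = os.stat(output).st_mtime_ns
    except FileNotFoundError:
        return True  # never built

    for dep in dependencies:
        try:
            if os.stat(dep).st_mtime_ns > out_mtime:
                return True  # dependency changed after the last build
        except FileNotFoundError:
            return True  # a vanished dependency invalidates the object
    return False
```

A timestamp check like this can miss changes that preserve the mtime, so it trades some accuracy for speed.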

> Cheers,
> /Manuel
>

Pierrick

Hi Pierrick,

You also have the option of writing your own CompilationDatabase and using it in place of the built-in options. You get the benefit of not having to modify any internal clang code, complete control over which files are chosen for compilation/analysis (assuming you’re using libTooling), and you can add whatever extra information you need to it, as long as it conforms to the clang::tooling::CompilationDatabase API.

- Neil

(API at: http://clang.llvm.org/doxygen/classclang_1_1tooling_1_1CompilationDatabase.html)

Hello Neil,

That seems like a good idea: having my cake and eating it too. However, as Laszlo (the author of the Bear tool I mentioned above) pointed out, it would confuse users, who would think they are using a standard compilation database when they are not.

And even if I patch clang to accept mine, users of old versions will still have trouble (and there will be plenty of them... in the beginning). So I will probably go for another format and let users export a standard compilation database in case they use a tool that needs it. But Bear already does this job (and does it well), so I'm not sure I will implement that.

Well, thank you all for the information. Such a nice first experience on this mailing list :).

Pierrick