to make the tooling library more widely applicable, we’d like to make the CompilationDatabase architecture extensible, i.e. provide the option of implementing custom adapters to other kinds of compilation databases (other than the currently available JSONCompilationDatabase).
In r160061, I provided a hack that enables this, but we definitely want something nicer.
In the attached patch, I am using LLVM’s plugin registry to enable registering new compilation databases. To provide a new CompilationDatabase, you’d need to subclass CompilationDatabasePlugin and CompilationDatabase and register the former with CompilationDatabasePluginRegistry::Add (as shown for JSONCompilationDatabase). This is not a finished patch for review, I’d just like to get feedback on the approach. What do you think?
Also, several points that need to be addressed:
Error handling: In CompilationDatabase::loadFromDirectory, I currently simply try each plugin until I find a compilation database. Any ideas, what I should do with the error messages, especially if I do not find any compilation database. Should I concatenate them, potentially with the plugin-name as a prefix?
Should I move the JSONCompilationDatabase to a separate set of files as it is now loosely coupled?
to make the tooling library more widely applicable, we'd like to make the CompilationDatabase architecture extensible, i.e. provide the option of implementing custom adapters to other kinds of compilation databases (other than the currently available JSONCompilationDatabase).
In r160061, I provided a hack that enables this, but we definitely want something nicer.
In the attached patch, I am using LLVM's plugin registry to enable registering new compilation databases. To provide a new CompilationDatabase, you'd need to subclass CompilationDatabasePlugin and CompilationDatabase and register the former with CompilationDatabasePluginRegistry::Add (as shown for JSONCompilationDatabase). This is not a finished patch for review, I'd just like to get feedback on the approach. What do you think?
I definitely prefer this approach!
Also, several points that need to be addressed:
- Error handling: In CompilationDatabase::loadFromDirectory, I currently simply try each plugin until I find a compilation database. Any ideas, what I should do with the error messages, especially if I do not find any compilation database. Should I concatenate them, potentially with the plugin-name as a prefix?
Yes, I think it's reasonable to concatenate them with the name of the plug-in, because we don't know which plug-in the user intended to use.
- Should I move the JSONCompilationDatabase to a separate set of files as it is now loosely coupled?
to make the tooling library more widely applicable, we'd like to make the CompilationDatabase architecture extensible, i.e. provide the option of implementing custom adapters to other kinds of compilation databases (other than the currently available JSONCompilationDatabase).
In r160061, I provided a hack that enables this, but we definitely want something nicer.
In the attached patch, I am using LLVM's plugin registry to enable registering new compilation databases. To provide a new CompilationDatabase, you'd need to subclass CompilationDatabasePlugin and CompilationDatabase and register the former with CompilationDatabasePluginRegistry::Add (as shown for JSONCompilationDatabase). This is not a finished patch for review, I'd just like to get feedback on the approach. What do you think?
I definitely prefer this approach!
My main argument against static registries is (bad) experience with
using them in Windows / DLL environments. Looping in Michael Spencer
to verify that this will not be a problem here (I really want the
tooling stuff to not run into problems with Windows down the line).
to make the tooling library more widely applicable, we'd like to make the CompilationDatabase architecture extensible, i.e. provide the option of implementing custom adapters to other kinds of compilation databases (other than the currently available JSONCompilationDatabase).
In r160061, I provided a hack that enables this, but we definitely want something nicer.
In the attached patch, I am using LLVM's plugin registry to enable registering new compilation databases. To provide a new CompilationDatabase, you'd need to subclass CompilationDatabasePlugin and CompilationDatabase and register the former with CompilationDatabasePluginRegistry::Add (as shown for JSONCompilationDatabase). This is not a finished patch for review, I'd just like to get feedback on the approach. What do you think?
I definitely prefer this approach!
My main argument against static registries is (bad) experience with
using them in Windows / DLL environments. Looping in Michael Spencer
to verify that this will not be a problem here (I really want the
tooling stuff to not run into problems with Windows down the line).
Cheers,
/Manuel
This will work if statically linked, but not dynamically unless the
__declspec(dll{export,import}) stuff is sprinkled on. However in this
case it may be possible as the interface seems quite small. Although
it would also require adding sprinkles to any non-stdlib c++ types
that are passed through the interface.
Being limited to static linking on Windows seems fine for now; it's what Daniel's prior solution was doing already, and we know roughly what we need to do to get dynamic linking working better on Windows.
Cool, so I’ll go ahead and finish this patch. There is one remaining issue: If I don’t use anything from a .o file, the linker will simply disregard it, thereby rendering the contained compilation database useless. Is there a standard way to prevent that?
I was looking at adding compilation database support for ninja build
files. However, I ran into the problem that it basically amounted to
rewriting the entire ninja build file parser that is already
implemented inside of ninja. As such, it seems like the very nature of
compilation databases is something that needs to be output by the tool
configuring/doing the build.
Is your thought with this that, for example, ninja would provide a
compilation database plugin reusing its own parser?
For ninja in particular, I have long thought that the correct approach is
to write something which can convert 'ninja -t commands' into the JSON
format, or to build a tool for writing the JSON database directly into
ninja. Then CMake (or anything else that generates the build file) can
inject targets to automatically generate and update this file. This
includes re-building generated source files as necessary.
I think for OSS build systems and systems that are largely
command-line-driven like ninja, the file output and interchange system is a
much better alternative. The plugin support I think is mostly useful for
stateful build systems which may really need to implement the database
using custom logic.
For ninja in particular, I have long thought that the correct approach is to write something which can convert 'ninja -t commands' into the JSON format, or to build a tool for writing the JSON database directly into ninja.
This was my line of thought as well.
Since the C++ stdlib unfortunately doesn't have JSON support, what do
you think about a simpler, more plaintext-y compilation database
(PlaintextCompilationDatabase?); it seems like that might be a win
since it is simpler to output. For example, the format would just be
Just a mirror of the JSON one, but with a format easier to output (god
forbid there's a newline in one of the command lines or filenames). On
the other hand, throwing together a simple "good-enough" JSON writer
isn't *too* difficult; nonetheless, for tools written in C/C++, this
could lower the bar to entry. Do you know what CMake currently does?
Does it have its own mini JSON writer?
We didn't go with the plaintext route for three important reasons:
1) Filenames do have whitespace in them. The format should be robust
against that.
2) Many other tools and languages should have a minimal barrier to read
them. An existing format eases this.
3) It is *incredibly* easy to write a minimal JSON writer that only
supports the features we need.
Manuel contributed the CMake support for the JSON database, and it includes
just such a trivial JSON writer. =] We can easily place a copy of the code
into LLVM or re-license it however it helps.
That said, the latest version of CMake already has support for JSON + Ninja
-- we didn't contribute it, so I don't know what strategy they followed,
but you should look at that and talk to the ninja and CMake developers
before going too far here.
While I also like the plugin concept much better, I am still unsure about the fundamental problem of getting other libraries linked in. The problem is that the linker will simply throw away any object file that has no references to it. After some discussions on IRC, I see two options:
Conditionally compile in a custom registration method, i.e.:
in CompilationDatabase.cpp:
void someInit() { #ifdef USE_CUSTOM_COMPILATION_DATABASES
RegisterCompilationDatabases();
#endif
}
And then implement RegisterCompilationDatabases() together with the custom compilation database.
Pros: Very easy to understand and to use right.
Cons: It would be hard to link together multiple such implementations.
Conditionally compile in custom anchors, i.e;.:
in CompilationDatabase.cpp: #ifdef COMPILATION_DATABASE_ANCHORS
COMPILATION_DATABASE_ANCHORS; #endif
And then e.g. “-DCOMPILATION_DATABASE_ANCHORS=‘extern volatile int MyAnchorSource; static int MyAnchorDest = MyAnchorSource;’”
Pros: Multiple custom compilation databases can be added. Nice clean separation (conditional code can be somewhere on the bottom of “CompilationDatabase.cpp”.
Cons: Seems a bit tangled and easy to get the -D part wrong.
I will try to get #2 to work (possibly simplifying it a bit with more macros :-/). Does anyone see a better solution?
Or maybe:
- add a private field 'hook' in CompilationDatabase (a function pointer
to the user callback),
- initialize it by default to NULL
- add a setter to CompilationDatabase for this hook field
- and use the hook if non null in findCompilationDatabaseFromDirectory
The burden of linking in the other required libraries is left to the
user of the hook --- which seems good to me : we can not generally
decide for the user which libraries should be linked in.
This can probably be extended by changing the hook to a vector (of
hooks). This would allow registering several user backends to a specific
CompilationDatabase instance, and have it try to use them in turn, until
one succeed.
Or maybe:
- add a private field 'hook' in CompilationDatabase (a function pointer to
the user callback),
- initialize it by default to NULL
- add a setter to CompilationDatabase for this hook field
This is the problematic part, as it requires changing the tool's code
for each environment, if I understand you correctly. One of the
reasons we want the plugins is that we can use upstream clang tools
without keeping a branch that changes their code to register our
internal environment stuff.
That said, the latest version of CMake already has support for JSON +
Ninja -- we didn't contribute it, so I don't know what strategy they
followed, but you should look at that and talk to the ninja and CMake
developers before going too far here.
I wrote it and pretty much followed the same thing Manuel did in the
Makefile generator.
The commit which actually adds the feature is trivial:
As Chandler said, it's not in a release yet, but will be in the next release
in a few weeks. Feel free to test the release candidate (I would appreciate
if you did)
I know multiple people who are refusing to work without this any more,
I've been using it since it landed in "next". So consider this part
pretty well tested (at least on the llvm codebase).
Attached is a patch to do most of this restructuring, kindly asking for review.
In particular, it does:
Restructure the compilation database architecture to using LLVM’s registry concept. It should now be possible to link in additional compilation databases.
Separate the JSONCompilationDatabase from CompilationDatabase to show the loose coupling and serve as an example.
I'd like the class-comment for JSONCompilationDatabase to still
include what's now in the file comment, so it's visible in doxygen.
Also, I'd prefer to use a result value to capture the error message
instead of llvm::outs'ing in findCompilationDatabaseFromDirectory.
Perhaps we should also switch this to Diagnostics?