Context: I’ve been looking at experimenting with using Modular Code Generation (My talk at last year’s LLVM dev meeting https://www.youtube.com/watch?v=lYYxDXgbUZ0 is about the best reference at the moment) when building the LLVM project, as a good experiment for the feature. This can/does enforce a stronger layering invariant than LLVM has historically been enforced. So I’m curious to get buy-in and maybe document this if it’s something people like the idea of.
I’m starting this discussion here rather than in an actual code review on llvm-commits since it seems like it could do with a bit of a wider discussion, but once/if the general direction is agreed on, I’ll send a patch for review of specific wording for the LLVM Coding Standards.
Currently the LLVM Coding Standards doesn’t say much/anything about layering. ‘A Public Header File is a Module’ section talks about modules of functionality, mostly trying to describe why a header file should be self contained - but uses anachronistic language about modules that doesn’t line up with the implicit or explicit modules concepts in use today, I think.
I propose making this wording a bit more explicit, including:
Headers should be standalone (include all their dependencies - this is mentioned in the “is a Module” piece, by way of a technique to help ensure this, but not explicit as a goal itself).
Files intended to be included in a particular context (that aren’t safe/benign to include multiple times, in multiple .cpp files, etc) should use a ‘.inc’ or ‘.def’ (.def specifically for those “define a macro, include the header which will reference that macro” style setups we have in a few places).
And the actual layering issue:
3) Each library should only include headers or otherwise reference entities from libraries it depends on. Including in headers and inline functions. A simple/explicit way to put this: every inline function should be able to be moved into a .cpp file and the build (with a unix linker - one that cannot handle circular library dependencies) should still succeed.
This last point is the most interesting - and I hope one that people generally find desirable, so it might not be immediately obvious why it may be contentious or difficult:
LLVM violates this constraint by using inline functions in headers to avoid certain layering constraints that might otherwise cause the build to fail. A couple of major examples I’ve hit are:
TargetSelect.h and similar: This one’s especially tricky - the header is part of libSupport, but each function in here depends on a different subset of targets (creating a circular dependency) - to call the given function the programmer needs to choose the right dependencies to link to or the program will not link.
Clang Diagnostics (work in progress): The diagnostics for each component are in their own component directories, but are then all included from libClangBasic, a library none of those components depends on. (so this isn’t so much an inlining case as #include based circular dependency)
Generally I’d like to get buy-in that stricter layering is desirable, and that these few cases are at least sub-optimal, if accepted for now.
Happy to go into more details about any of this, examples, etc, but I realize this is already a bit long.