Ideas for making llvm-config --cxxflags more useful

Hi,

My understanding of llvm-config --cxxflags is that it is supposed to report
which flags are necessary to compile a program that will include LLVM's
headers and link against its libraries. What it currently reports is
all of the flags which were used to compile LLVM. This is not very useful,
because users are required in most cases to filter out flags they don't
want.

I would like to try to fix this so that it reports only the bare minimum
of required flags. As an example here all the flags that it reports in
my autoconf build of llvm:

-I/usr/local/llvm/3.8/include
-D_DEBUG
-D_GNU_SOURCE
-D__STDC_CONSTANT_MACROS
-D__STDC_FORMAT_MACROS
-D__STDC_LIMIT_MACROS
-O3
-fomit-frame-pointer
-std=c++11
-fvisibility-inlines-hidden
-fno-exceptions
-fno-rtti
-fPIC
-ffunction-sections
-fdata-sections
-Wcast-qual

Of these flags the only ones that are really required are (c++ experts
please correct me if I'm wrong here):

-I/usr/local/llvm/3.8/include
-D__STDC_CONSTANT_MACROS
-D__STDC_FORMAT_MACROS
-D__STDC_LIMIT_MACROS
-std=c++11

Technically the -D__STDC* macros are only required if you include
Support/DataTypes.h, but I think that is hard to avoid.

As I understand, The rest of the flags are not required in 100% of the
use cases.

My proposal for fixing this is to remove everything but the 5 options listed
above.

For flags like -fno-rtti (are there others?) that are required in some cases
(I think -fno-rtti is required only if you sub-class LLVM objects), I would propose
adding a separate flag like --uses-rtti. This would give users more fine-grained
control over which flags they use, and also would let them choose the correct
flag since, for example, -fno-rtti is not understood by MSVC.

How do people feel about this proposal?

Thanks,
Tom

Hey Tom,

I’m not a regular user of llvm-config, but this sounds completely right to me, and it would be a significant improvement over what we have now.

The only question I want to raise is, what about NDEBUG? There are headers that conditionalize on NDEBUG, which could lead to ABI incompatibility in the C++ API.

Thanks for doing this,
-Chris

Is it something that can be fixed or would it be too complicated to handle?
It would be nice in general to be able to link a “Non assert” build of Clang with an “Assert" version of LLVM (and vice-versa).

And on the original topic: +1 for Tom proposal/goal, it makes sense to me.

The only question I want to raise is, what about NDEBUG? There are headers that conditionalize on NDEBUG, which could lead to ABI incompatibility in the C++ API.

Is it something that can be fixed or would it be too complicated to handle?
It would be nice in general to be able to link a “Non assert” build of Clang with an “Assert" version of LLVM (and vice-versa).

Michael

The problem with not including it is that if your LLVM build is not ABI compatible with different values of NDEBUG.

-Chris

If as Chris said it is required for ABI compatibility, my understanding is that you don’t really have any choice: if you want to compile/link at the C++ level with LLVM you’ll have to include this flag or you’ll run into strange runtime issue.

Hey Tom,

I’m not a regular user of llvm-config, but this sounds completely right to me, and it would be a significant improvement over what we have now.

The only question I want to raise is, what about NDEBUG? There are headers that conditionalize on NDEBUG, which could lead to ABI incompatibility in the C++ API.

Is it something that can be fixed or would it be too complicated to handle?

I think this would need a case-by-case determination that I haven’t done. There are quite a few occurrences of NDEBUG in headers:

grep -r NDEBUG ./include/ | wc -l
77

I imagine many of these aren’t actually ABI issues, but I know that ABI issues have existed in the past.

-Chris

Mehdi Amini <mehdi.amini@apple.com> writes:

Hey Tom,

I’m not a regular user of llvm-config, but this sounds completely
right to me, and it would be a significant improvement over what we
have now.

The only question I want to raise is, what about NDEBUG? There are
headers that conditionalize on NDEBUG, which could lead to ABI
incompatibility in the C++ API.

Is it something that can be fixed or would it be too complicated to handle?
It would be nice in general to be able to link a “Non assert” build of
Clang with an “Assert" version of LLVM (and vice-versa).

And on the original topic: +1 for Tom proposal/goal, it makes sense to me.

It's probably possible, but I think it's quite a bit of work from where
we are now. A lot of our headers are not ABI compatible between NDEBUG
and !NDEBUG.

llvm-config does already provide --assertion-mode to check NDEBUG in
particular. I'm not sure if that's sufficient though, since it's easy to
do the wrong thing when you need to use two different calls to get the
right set of flags to be able to grok llvm headers.

To add a few data points. I did a quick scan of the headers and here are the ABI incompatibilities I came up with:

/Users/cbieneman/dev/open-source/llvm/include/llvm/Analysis/ScalarEvolutionExpander.h:86-88
/Users/cbieneman/dev/open-source/llvm/include/llvm/CodeGen/FunctionLoweringInfo.h:114-117
/Users/cbieneman/dev/open-source/llvm/include/llvm/CodeGen/MachineScheduler.h:244-248,623-627
/Users/cbieneman/dev/open-source/llvm/include/llvm/CodeGen/RegAllocPBQP.h:323-325
/Users/cbieneman/dev/open-source/llvm/include/llvm/CodeGen/ScheduleDAGInstrs.h:260-262
/Users/cbieneman/dev/open-source/llvm/include/llvm/CodeGen/SelectionDAG.h:295-297
/Users/cbieneman/dev/open-source/llvm/include/llvm/IR/ValueHandle.h:188-190,194-201
/Users/cbieneman/dev/open-source/llvm/include/llvm/MC/MCSchedule.h:27-29,105-107

-Chris

I quickly looked at them and found these that I believe won’t trigger a linked error:

  • class SCEVExpander: has a field dependent on NDEBUG: “const char *DebugType;”

  • class FunctionLoweringInfo: two fields dependent on NDEBUG: SmallPtrSet<const Instruction *, 8> CatchInfoLost and CatchInfoFound

  • class ScheduleDAGMI: has a field dependent on NDEBUG: unsigned NumInstrsScheduled;

  • class SchedBoundary: has a field dependent on NDEBUG: unsigned MaxObservedStall;

  • class NodeMetadata: has a field dependent on NDEBUG: bool everConservativelyAllocatable;

  • class ScheduleDAG: has a field dependent on NDEBUG: bool StressSched;

  • class SelectionDAG: has a field dependent on NDEBUG: std::map<const SDNode *, std::string> NodeGraphAttrs;

  • class AssertingVH conditionally inherit from ValueHandleBase and has a dependent field: Value *ThePtr;

  • class MCProcResourceDesc and MCSchedClassDesc have a field dependent on NDEBUG: const char *Name;

It would be friendly to rename this flag in LLVM (e.g. LLVM_DEBUG), so that users can toggle whether they’re building their code with assertions independently of the LLVM that they’re linking.

David

>
>>
>>
>> Hey Tom,
>>
>> I’m not a regular user of llvm-config, but this sounds completely right to me, and it would be a significant improvement over what we have now.
>>
>> The only question I want to raise is, what about NDEBUG? There are headers that conditionalize on NDEBUG, which could lead to ABI incompatibility in the C++ API.
>
> Is it something that can be fixed or would it be too complicated to handle?

I think this would need a case-by-case determination that I haven’t done. There are quite a few occurrences of NDEBUG in headers:

I think the issue with NDEBUG has been raised in the past, but I can't
find the email thread. As I recall the suggestion was to replace NDEBUG
in headers with something like LLVM_NDEBUG. I think this makes sense.
Is there any downside to making this change?

-Tom

[Ooops, sent to the old list address by mistake]

I looked at the file, and this didn't seem true (e.g.
GetElementPtrInst::init is still out-of-line). But then I realized you
mean virtual functions, so these classes no longer have a key
function.

This is probably Pete's r240588. I suppose we could add key functions
to these classes (even if they're not used for anything). I'm not sure
how we'd prevent this from regressing though :-/

- Hans

> [Ooops, sent to the old list address by mistake]
>
>>
>> For flags like -fno-rtti (are there others?) that are required in some
cases
>> (I think -fno-rtti is required only if you sub-class LLVM objects), I
would propose
>> adding a separate flag like --uses-rtti. This would give users more
fine-grained
>> control over which flags they use, and also would let them choose the
correct
>> flag since, for example, -fno-rtti is not understood by MSVC.
>
> There appears to be a regression in LLVM 3.7, which means that you must
compile with -fno-rtti if you include llvm’s Instructions.h. The issue is
that a few of the classes (ICmpInst, GetElementPtrInst and PHINode) are now
defined entirely in the header, so every compilation unit that includes the
header will emit them. These classes all inherit from Instruction
(indirectly via CmpInst in the case of ICmpInst) and so fail to link if
compiled with -fno-rtti, because they can’t find the vtable for ICmpInst.

I looked at the file, and this didn't seem true (e.g.
GetElementPtrInst::init is still out-of-line). But then I realized you
mean virtual functions, so these classes no longer have a key
function.

This is probably Pete's r240588. I suppose we could add key functions
to these classes (even if they're not used for anything). I'm not sure
how we'd prevent this from regressing though :-/

In theory the LLVM style guide mandates key functions for all dynamic
classes (under the claim of build performance - avoiding duplicate vtable
emission, etc). We've never strongly enforced it though - if we really
wanted to, we could do so as Clang has a warning that triggers whenever a
vtable is emitted weakly (which is what happens when there isn't a key
function).

- David

Sorry about this. I think if I’d finished the work to remove vtables from Value then it wouldn’t be an issue, but I put that on hold due to performance concerns.

I’ve almost finished a patch to add back in either out of line destructors or anchor methods. We seem to use one or the one, relatively inconsistently.

What i’ve gone for is that if a class already had an inline destructor then i left it alone and added an anchor method. Otherwise I added an out of line destructor.

Now if I compile Instructions.cpp with -Wweak-vtable, the only warnings given are:

…/include/llvm/PassSupport.h:226:8: warning: ‘PassRegistrationListener’ has no out-of-line virtual method definitions;
its vtable will be emitted in every translation unit [-Wweak-vtables]
struct PassRegistrationListener {
^
In file included from …/lib/IR/Instructions.cpp:16:
In file included from …/lib/IR/LLVMContextImpl.h:19:
In file included from …/lib/IR/ConstantsContext.h:25:
…/include/llvm/Support/raw_ostream.h:321:7: warning: ‘raw_pwrite_stream’ has no out-of-line virtual method
definitions; its vtable will be emitted in every translation unit [-Wweak-vtables]
class raw_pwrite_stream : public raw_ostream {
^
…/include/llvm/Support/raw_ostream.h:539:7: warning: ‘buffer_ostream’ has no out-of-line virtual method definitions;
its vtable will be emitted in every translation unit [-Wweak-vtables]
class buffer_ostream : public raw_svector_ostream {

However, I do wonder how useful my patch for Instructions.h (and constants.h too which needed work), when you can’t #include PassSupport without -fno-rtti.

So, i’l get a patch out soon to look at, but i would like to discuss whether any of you think its worth it given the other warnings which still prevent no-rtti.

Cheers,
Pete

Here’s the patch. It builds against ToT. If we need the old behavior regarding rtti and Instructions.h then i can put it out for review.

Cheers,
Pete

vtable.diff (23.9 KB)

I’ve almost finished a patch to add back in either out of line destructors
or anchor methods. We seem to use one or the one, relatively
inconsistently.

What i’ve gone for is that if a class already had an inline destructor
then i left it alone and added an anchor method. Otherwise I added an out
of line destructor.

That seems possibly the opposite of what might be desired - if the class
has no user-declared dtor, it's implicitly inline and possibly trivial (&
doesn't suppress implicit move and copy ops (rule of 0/5)) - I'd avoid
adding a dtor to a class that doesn't have/need one (& whenever I see
empty-bodied dtors I try to remove them - not always possible).

I was probably the first one to use (or at least do a lot of them) the
explicit anchors in an attempt to make LLVM -Wweak-vtables clean but never
got around to going all the way there. I think Chris Lattner even advocated
for not out-of-lining dtors just to make the vtables strong, instead using
an explicit anchor. But I'm pretty ambivalent about it. In theory the style
guide advocates this for build perf reasons - which would be interesting to
test & see if it's actually relevant. If it's a matter of
correctness/functionality as this thread seems to imply (but I haven't read
it enough to really understand what's being dealt with here, to be honest)
then I guess we'll have to decide on some approach... *shrug*

- Dave

Thanks for the explanation. I knew we wouldn’t be able to inline the destructors when they are out of line, but i didn’t know about the rule of 0/5 which is also interesting.

So I didn’t really like the look of what I did either which is why i thought it best to explain it. I’m happy to change over what I’ve done to anchor’s, but its unfortunate that it will then differ from some of the other sibling classes.

Perhaps this is moot* though, if I can get back to finishing the devirtualisation without any perf issues.

  • moot for ToT perhaps. I’m happy to get something for 3.7 if we can work out what it should be. The options seem to be
  • Do nothing to the code, except perhaps change llvm-config as David said
  • This patch
  • The equivalent to this with anchor’s.

Anyone got a strong opinion which is best, especially as I don’t want to do anything too big for 3.7 at this point.

Cheers,
Pete