`__attribute__((const))` is needed where `const` should have sufficed?

Toy example: (godbolt link)

class C {
    bool m_cond;
    int m_i;
    void method1();
    int f() const;
    C();
};
void C::method1() {
    for(int i=0; i<5; ++i) {
        if (m_cond)
          m_i = f();
    }
}

LICM fails to hoist the load of m_cond out of the loop, with the optimization remark: “Failed to move load with loop invariant address because the loop body may invalidate its value”.
If f is marked __attribute__((const)) and not just const, the optimization succeeds.
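
For readers unfamiliar with the difference, here is a minimal sketch of the two annotations side by side. It assumes GCC or Clang (the attribute is a GNU extension), and the body of f, the public members and the demo helper are made up for illustration. The attribute is an unchecked promise to the compiler, so it is only honest here because this f touches no memory at all.

```cpp
#include <cassert>

class C {
public:
    bool m_cond = true;
    int m_i = 0;
    void method1();
    // The trailing `const` only makes `this` a pointer-to-const inside f.
    // __attribute__((const)) additionally promises that f neither reads nor
    // writes memory, which is what lets the optimizer keep m_cond in a register.
    __attribute__((const)) int f() const;
};

// Hypothetical body: returns a fixed value and touches no memory.
int C::f() const { return 42; }

void C::method1() {
    for (int i = 0; i < 5; ++i) {
        if (m_cond)
            m_i = f();
    }
}

// Hypothetical helper that exercises the loop.
int demo() {
    C c;
    c.method1();
    assert(c.m_i == 42);
    return c.m_i;
}
```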

Is there really a theoretical code for f() const that modifies m_cond?

From the C++ standard:

9.2.8.1 The cv-qualifiers [dcl.type.cv]

4. Any attempt to modify (7.6.19, 7.6.1.5, 7.6.2.2) a const object (6.8.3) during its lifetime (6.7.3) results in undefined behavior.

In general, there’s a feeling among users that const isn’t used nearly as much as it should be. See e.g. this

Is there really a theoretical code for f() const that modifies m_cond?

An instance of the C class does not have to be const for a const member function to be invoked. For example, suppose an instance of C is a global variable or a singleton. C::f could obtain a non-const pointer/reference to the same object as this and modify it.
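
A minimal sketch of that situation, with a hypothetical global g_instance and a made-up call counter m_calls: f is a const member function, yet it legally changes m_cond through the non-const global, so a hoisted load of m_cond would change the program's behaviour.

```cpp
#include <cassert>

struct C {
    bool m_cond = true;
    int m_calls = 0;  // hypothetical counter, only here to observe the loop
    int f() const;
    void method1();
};

C g_instance;  // non-const global (hypothetical name)

int C::f() const {
    // `this` is a const C*, but g_instance is a non-const lvalue that may
    // name the very same object. Writing through it is well-defined.
    g_instance.m_cond = false;
    return 1;
}

void C::method1() {
    for (int i = 0; i < 5; ++i) {
        if (m_cond)          // has to be reloaded on every iteration
            m_calls += f();
    }
}

int calls_after_method1() {
    g_instance.method1();
    assert(!g_instance.m_cond);  // f() flipped it through the global
    return g_instance.m_calls;
}
```

With the load reloaded each iteration, f runs once; if m_cond were read once before the loop, f would run five times.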

@kuhar - true, and it compiles and runs. You could also just const_cast away. But judging by the standard snippet I think this is UB, isn’t it?

Disclaimer: I am not a language lawyer, but I think it’s perfectly fine because the object was not created as const.

Nope, not UB.

It’s only UB to modify an object that was created as a const object. A const pointer to a non-const object may always be cast to a non-const pointer, and then used to modify the object.
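
As a sketch of the legal case (the function name flip_non_const and the inline body of f are made up for illustration): the object is created non-const, reached through a const reference, and modified inside a const member function via const_cast.

```cpp
#include <cassert>

struct C {
    bool m_cond = true;
    int f() const {
        // Well-defined only because the object `this` points to was not
        // created const; on a `const C` object this write would be UB.
        const_cast<C*>(this)->m_cond = !m_cond;
        return 0;
    }
};

bool flip_non_const() {
    C c;               // created as a non-const object
    assert(c.m_cond);
    const C& ref = c;  // const access path to a non-const object
    ref.f();           // flips c.m_cond through the cast
    return c.m_cond;
}
```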

Hopefully this modification drives the point better (here C is created as const):

class C {
public:
    C();
    bool m_cond;
    int method1() const;
    int f() const;
};
C::C() = default;

int C::method1() const {
    int a=0;
    for(int i=0; i<5; ++i) {
        if (m_cond)
          a += f();
    }
    return a;
}

int main() {
    const C c;
    int a = c.method1();
}

Is there a reason not to hoist m_cond outside the loop here?

In this example – once you fix the constructor to initialize m_cond to false, instead of leaving it uninitialized – main is simply optimized to return 0, because it knows the value of c.m_cond is a constant false.

@jyknight thank you, but this is obviously a contrived example and working around the issue doesn’t serve anything. This is the gist of real, much more complex, scenarios.
The question remains: is there a semantic reason why m_cond isn’t hoisted outside the loop? Is there some legal code in f that might change m_cond?

Wasn’t that said before, const_cast<C*>(this)->m_cond = !m_cond; should be legal, no?

Using const_cast to modify an object that was created as const is UB.
But actually tracking the creation of an object might be too much to ask of a compiler…

AFAIK, we don’t track that at all right now. In a case like this we cannot optimize method1 unless we know that every call to it is made on an object created as const C. That is a potential use case, but probably one that is hard to prove in most real code.

OK, I see what you mean – this is interesting! Simplifying and modifying the example a bit:

struct C {
    C();
    int f() const;
    bool m_cond;
};

int main() {
    const C c;
    if(c.m_cond) c.f();
    if(c.m_cond) c.f();
    if(c.m_cond) c.f();
}

The important difference in this version is that the constructor is not visible to the optimizer, so, the compiler cannot know the value of c.m_cond. Yet, because c is a const complete object, the compiler should know that the value cannot ever change, and merge the branches.
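
Written by hand, the merged form might look like the sketch below. The constructor and f bodies, the counter g_calls and the function name merged are hypothetical and only there to make it compile in a single file; in the real scenario they would live in another translation unit.

```cpp
#include <cassert>

struct C {
    C();
    int f() const;
    bool m_cond;
};

static int g_calls = 0;  // hypothetical counter to observe the calls

// Stand-ins for the definitions the optimizer cannot see.
C::C() : m_cond(true) {}
int C::f() const { ++g_calls; return 0; }

int merged() {
    g_calls = 0;
    const C c;
    // c is a const complete object, so m_cond cannot legally change after
    // construction: one load is enough for all three tests.
    const bool cond = c.m_cond;
    if (cond) {
        c.f();
        c.f();
        c.f();
    }
    assert(g_calls == 3);
    return g_calls;
}
```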

Yet, it does not.

LLVM IR has a pair of intrinsics llvm.invariant.start/end which theoretically allows expressing exactly this “value cannot change after the constructor, until the destructor” constraint. However, Clang doesn’t emit it for const local variables, only global initializers. I’m not sure whether there’s some reason why it wasn’t done, or if simply nobody ever got around to it.

AFAICT, it ought to be correct to emit the pair for const locals, because not only is it UB to modify the object within its lifetime, you also cannot destroy and recreate it in place, per [basic.life] p10:

Creating a new object within the storage that a const complete object with static, thread, or automatic storage duration occupies, or within the storage that such a const object used to occupy before its lifetime ended, results in undefined behavior.

(That is, even the obnoxiously-often-valid trick of c.~C(); new (&c) C(); isn’t permitted.)
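
For contrast, a sketch of the trick in the case where, as far as I understand the rules, it is valid: a non-const automatic object replaced by a new object of the same type. The struct layout and the function name recreate_in_place are made up for illustration; the same two statements applied to a `const C` would be UB per the paragraph quoted above.

```cpp
#include <cassert>
#include <new>

struct C {
    int value;
    explicit C(int v) : value(v) {}
};

int recreate_in_place() {
    C c(1);          // non-const object with automatic storage duration
    c.~C();          // end its lifetime...
    new (&c) C(2);   // ...and create a new C in the same storage
    // Same type, not const: the name `c` now refers to the new object.
    assert(c.value == 2);
    return c.value;
}
```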

However, sadly, even if Clang did emit this intrinsic, the optimizer still cannot remove the redundant loads. It looks like all the optimization passes that look at it (AliasSetTracker, EarlyCSE, LICM) can only handle indefinite invariant sections (without an end).

For extra brokenness, the presence of TBAA metadata also breaks it. E.g. take this minimized IR example. The second load ideally should be removed by running opt -O3 -S on this:

declare ptr @llvm.invariant.start.p0(i64, ptr)
declare void @llvm.invariant.end.p0(ptr, i64, ptr)
declare void @f(ptr %arg)

define i8 @test(ptr %arg) {
  %invst = call ptr @llvm.invariant.start.p0(i64 1, ptr %arg)
  %val1 = load i8, ptr %arg, align 1, !tbaa !0
  call void @f(ptr %arg)
  %val2 = load i8, ptr %arg, align 1, !tbaa !0
  %res = add i8 %val1, %val2
  call void @llvm.invariant.end.p0(ptr %invst, i64 1, ptr %arg)
  ret i8 %res
}

!0 = !{!1, !2, i64 0}
!1 = !{!"_ZTS1C", !2, i64 0}
!2 = !{!"bool", !3, i64 0}
!3 = !{!"omnipotent char", !4, i64 0}
!4 = !{!"Simple C++ TBAA"}

However, the redundant load is removed only if you both delete the llvm.invariant.end call AND remove the “!tbaa !0” from the loads. Unfortunate…

@jyknight Thank you! On the bright side, this means having Clang add llvm.invariant.start/end on local consts is completely safe :) Might be a good place to start.

Yes, Clang doesn’t optimize const variables as well as it could. See this design proposal from 2015 to address that:
https://lists.llvm.org/pipermail/llvm-dev/2015-October/091178.html

I’ve forgotten the details, but this is a deceptively difficult project to take on because of the ways that object memory can transition from being immutable to mutable before destruction.

There is also some overlap here with vptr loads, which are immutable in similar ways, and the challenges of optimizing those are well researched and documented.

@rnk fascinating! Do you know if this work advanced since the paper?
Also, if the object was created const (which is a considerable portion, if not most, of cases) there is no C++-legal way for it to become mutable. Perhaps this case is simple enough to attack? Are there subtleties I’m missing? Can you give an example of a case where “object memory can transition from being immutable to mutable before destruction” for an object that was created const?

[class.dtor]

@jrtc27 this applies during destruction. The question was whether an object can be created as const and turn to mutable before destruction.

I believe mutable members can still be modified even if the object is const, not just in const member functions. That is:

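struct S {
    mutable int x = 0;
};

void g() {
    const S s{};  // initializer needed: S has no user-provided constructor
    s.x = 42;     // fine: x is mutable, even though s is const
}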

is well-defined.

@jrtc27 This is true, but easy to detect and was handled conservatively in the work mentioned above:

… Hence, we treat const objects with mutable fields as non-const objects with const fields.

And yet, this is not a way in which an object transitions from immutable to mutable before destruction. I’m not aware of any legal ways to achieve that, exotic or contrived as they may be. (I believe the standard snippet above says categorically that this can’t happen.)

@rnk I found these: D11826, D13031 and D13603. I couldn’t see in any of these a mention of immutable/mutable transition trouble (but perhaps I missed it?). Is it possible this work was just dropped somehow?

I was very surprised to discover that llvm ignores so much potentially beneficial optimization information. This might be a big-ish deal.