Issue where Lambda capture of reference to globals doesn't actually capture anything

Hello,

We recently encountered an issue where clang has some unexpected behavior with respect to the capture of local references to global variables https://godbolt.org/z/KasP9K.

Most compilers (gcc, MSVC, icc) appear to create a member variable to hold the value of myfoo and have dummy return a size of 16. Clang does not store a member for myfoo (in the AST there is no FieldDecl for myfoo).

This leads to the interesting issue here: https://godbolt.org/z/G59e7M where clang and gcc print different values.
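
For readers who cannot open the Compiler Explorer links, here is a minimal sketch of the kind of program being described. It is a reconstruction, not the linked code: the names `foo`, `myfoo`, and `observe` are assumptions. Which value comes back depends on whether the compiler stores a copy in the closure or treats `myfoo` in the body as naming `foo` itself.

```cpp
#include <cassert>

int foo = 1;  // the global

// Hypothetical helper: reports what the lambda sees after foo is modified.
int observe() {
  foo = 1;                         // reset so repeated calls behave the same
  int& myfoo = foo;                // local reference to the global
  assert(&myfoo == &foo);          // the reference is just another name for foo
  auto f = [=] { return myfoo; };  // by-copy capture-default
  foo = 2;                         // modify the global after creating the closure
  // A closure that stored its own copy of the value yields 1 here;
  // a closure whose body still names foo itself yields 2.
  return f();
}
```

Calling `observe()` under each compiler and comparing the results is enough to see the divergence reported here.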

I don’t know if this is a clang issue (known?), or a {gcc,icc,msvc} issue, or is implementation defined, but any insight on this would be welcome.

-Drew

I confirm that this is a Clang bug, going back at least to Clang 6. I don’t know if it’s been reported on bugs.llvm.org yet.
https://godbolt.org/z/8419ss

It doesn’t matter whether the capture is done as [=] or as [x]. Doing it explicitly as [x=x] is a possible workaround.
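
A minimal sketch of the [x=x] workaround, assuming a local reference `x` bound to a global `foo` (both names are illustrative). An init-capture declares a new variable in the closure, so the value is copied when the lambda is created, regardless of how the compiler treats a plain capture of the reference.

```cpp
#include <cassert>

int foo = 1;  // illustrative global

// Hypothetical helper demonstrating the init-capture workaround.
int run_workaround() {
  foo = 1;                         // reset so repeated calls behave the same
  int& x = foo;                    // local reference to the global
  auto f = [x = x] { return x; };  // init-capture: closure member is an int copy
  foo = 2;                         // later writes to the global...
  assert(f() == 1);                // ...do not affect the stored copy
  return f();
}
```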

–Arthur

Thanks Arthur,

This came up in the context of the HIP backend. The HIP release notes apparently mention that this is currently a bug (I don’t write code for HIP, but apparently HIP cannot access globals in device functions). So while I don’t have a bug number I’m guessing that at least someone knows about it.

Incidentally, [x=x] is the workaround I just suggested to our users.

-Drew

N4861 subclause 7.5.5.2 [expr.prim.lambda.capture] paragraph 11 seems to apply. The local entity being captured is a reference and it is not odr-used (see [basic.def.odr] paragraph 4).

I’m not a standards-reading expert, but does the following note

Note 7:
An id-expression that is not an odr-use refers to the original entity, never to a member of the closure type.
However, such an id-expression can still cause the implicit capture of the entity.
end note

imply that the program https://godbolt.org/z/feKxdK is actually implementation-defined? Or does gcc have a bug here?

-Drew

GCC has a bug, according to the standard wording. The mention of myfoo does not constitute an odr-use, so is not rewritten to use the capture. Clang’s behavior is correct per the standard wording.

The standard rule is certainly surprising in this particular case. I think the rule in question is driven by the desire for adding a capture-default to a valid lambda to not change its meaning. For example: https://godbolt.org/z/nrWsvj
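
A minimal sketch of that motivation, not the linked example: the function and variable names are assumptions. The first lambda is valid with no capture because reading the value of a const int initialized with a constant is not an odr-use. Adding a capture-default in the second lambda leaves the meaning unchanged.

```cpp
#include <cassert>

// No capture at all: x is only read for its value, which is not an odr-use.
int without_capture_default() {
  const int x = 5;
  auto f = [] { return x; };
  return f();
}

// Same body with a capture-default added: x still names the original entity.
int with_capture_default() {
  const int x = 5;
  auto f = [=] { return x; };
  return f();
}

// Illustrative check that both spellings mean the same thing.
void check_same_meaning() {
  assert(without_capture_default() == with_capture_default());
}
```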

Hmm. It’s insane that you can use local variable x inside a lambda that doesn’t capture anything; I wasn’t aware of that. If the formal wording really says you can do that, then that seems like a formal-wording bug to me.
GCC and Clang disagree in a different way about whether it’s okay to dereference a pointer inside a lambda without capturing it; here it’s GCC that is doing the crazy thing, and Clang that is reasonably refusing to compile a reference to x when x hasn’t been captured.
https://godbolt.org/z/bjbr3f

The rules about when compilers are able to “constant-fold” away variables that otherwise would need to be captured strike me as similar in spirit to the rules about when compilers are able to “constant-fold” away complicated expressions in constant expressions. Some compilers seem to be trying to be “helpful” by letting the optimizer’s knowledge leak into the front-end, and some are following the same rules as one’s “head compiler” would.

–Arthur

Hmm. It’s insane that you can use local variable x inside a lambda that doesn’t capture anything; I wasn’t aware of that.

Aside: “insane” (here) and “crazy” (below) are somewhat more charged language than we’d prefer here.

If the formal wording really says you can do that, then that seems like a formal-wording bug to me.

It’s entirely intentional, though the case that was being thought about at the time was primarily local constant non-references, not local references (though the same rules apply to both).

void f() {
  constexpr int k = 5;
  [] { int arr[k]; };
}

… should obviously be valid. And more broadly, the rule is that anything that you can name without an odr-use is valid. And this isn’t at all special to lambdas; the same thing happens with all nested constructs:

int n;
void g() {
  const int q = 6;
  int &r = n;
  constexpr int *p = &n;
  struct A {
    void f() {
      int arr[q]; // ok
      r = *p + 1; // ok
    }
  };
}

https://godbolt.org/z/avG7qx

GCC and Clang disagree in a different way about whether it’s okay to dereference a pointer inside a lambda without capturing it; here it’s GCC that is doing the crazy thing, and Clang that is reasonably refusing to compile a reference to x when x hasn’t been captured.
https://godbolt.org/z/bjbr3f

Yeah, GCC’s behavior is not in line with the language rules in the pointer case. I’d guess they’re using a more-liberal evaluation mode when evaluating the initializer of the reference, rather than using the strict constant initializer rules, and that more-liberal mode permits them to read the initializer values of all const-qualified variables, not only the ones that are usable in constant expressions. (Clang has a more-liberal mode too, and I’d expect we also have bugs where a conforming program can tell the difference.)

The rules in question are approximately equivalent to: if a variable of const integral or enumeration type, or of reference type, can be constexpr, then it is implicitly constexpr. And constexpr variables can be used from anywhere, without triggering an odr-use, if only their value, not their address, is used. (We don’t actually say such things are implicitly constexpr, and there are corner cases where the outcome is different – in particular, for static data members, where constexpr implies inline but const does not. But “implicitly constexpr” is how they behave.)
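
A small sketch of the value-versus-address distinction described above, with illustrative names. Reading the value of a constexpr variable inside a lambda needs no capture. Taking its address is an odr-use, so the variable has to be captured for that lambda to be well-formed.

```cpp
#include <cassert>

// Only the value of k is used: no odr-use, so an empty capture list is fine.
int read_value_only() {
  constexpr int k = 5;
  auto f = [] { return k + 1; };
  return f();
}

// &k needs the object itself, which is an odr-use, so k must be captured.
// Writing this lambda with [] instead of [&] would be ill-formed.
int take_address() {
  constexpr int k = 5;
  auto g = [&] { const int* addr = &k; return *addr; };
  return g();
}

// Illustrative check of both cases.
void check_value_vs_address() {
  assert(read_value_only() == 6);
  assert(take_address() == 5);
}
```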

If you change your example to declare the pointer to be constexpr then the two compilers agree again.
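
A hedged sketch of the constexpr-pointer variant. This is not the code behind the link above: the global `target` and the function name are assumptions. Because `p` is constexpr, reading its value inside the lambda is not an odr-use, so no capture is needed under the rules described here.

```cpp
#include <cassert>

int target = 7;  // illustrative global with static storage duration

// Hypothetical helper: dereferences a constexpr pointer in a captureless lambda.
int deref_constexpr_pointer() {
  constexpr int* p = &target;   // address of a global is a constant expression
  auto f = [] { return *p; };   // p's value is read; p itself is not odr-used
  assert(p == &target);
  return f();                   // reads the current value of target
}
```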

The rules about when compilers are able to “constant-fold” away variables that otherwise would need to be captured strike me as similar in spirit to the rules about when compilers are able to “constant-fold” away complicated expressions in constant expressions. Some compilers seem to be trying to be “helpful” by letting the optimizer’s knowledge leak into the front-end, and some are following the same rules as one’s “head compiler” would.

I think this is at least not the proximal cause in the standard-conforming case – correctly implementing the odr-use rules in the Clang frontend was a non-trivial effort and wasn’t related to constant folding / optimizer knowledge leaking through. But I do think the standard rules here probably originally came from looking at what implementations at the time happened to do and writing that down (e.g., use of a const static data member with no definition is OK, because implementations happen to not emit a reference to the symbol), and complexity has built up from that point. I’m not sure where the references-are-implicitly-constexpr part came from, but the const-ints-are-implicitly-constexpr part is a backwards-compatibility concern inherited from C++98.

From a language design perspective, I think it would have made a lot more sense if we’d said that constexpr variables are implicitly static (so the question of capture or of use from other scopes can be answered trivially), and deprecated the “implicit constexpr” behavior for const integral/enumeration variables and references. Where we’ve ended up isn’t a great compromise. But this is really the wrong place to have such discussions :)

Yeah, but if you’re explicit about the constexpr it works fine in both compilers: https://godbolt.org/z/xdad8v. Here’s a summary of frontend divergence on this issue: https://godbolt.org/z/KfdKcT. I think basically this is one of those areas where implementation divergence is so significant that no one can write portable code dependent on it, so we should consider simplifying. I understand the desire to make []{ return x; }, [=]{ return x; }, and [x]{ return x; } all do the same thing (though that doesn’t save us from the fact that [x=x]{ return x; } does something different), but the reality is that anyone who is writing code that’s dependent on this behavior is either only using Clang or—worse—depending on the wrong behavior and avoiding Clang. And that doesn’t even always work; consider this example: https://godbolt.org/z/ETMGEq where adding a default capture causes Clang to treat the reference to a global as a reference to const.

I’m not sure where the references-are-implicitly-constexpr part came from

I think this is part of the problem. I think most people would find this at least slightly less confusing:

#include <iostream>

int foo = 1;
int main() {
  constexpr auto& myfoo = foo;  // explicitly constexpr reference to the global
  auto x = [=]{ return myfoo; };
  std::cout << x() << std::endl;
  foo = 2;
  std::cout << x() << std::endl;
}

But I think probably the best solution would be to consider deprecating the use of constexpr references in capture-by-value lambdas. The argument in many cases is pretty similar to the problems with implicit this capture. Very few people would be surprised by the behavior of this program, for instance:

#include <iostream>

int foo = 1;
int main() {
  constexpr auto& myfoo = foo;
  auto x = [=, &myfoo]{ return myfoo; };  // myfoo explicitly captured by reference
  std::cout << x() << std::endl;
  foo = 2;
  std::cout << x() << std::endl;
}

The fact that this works the same way currently on all of the major implementations is another bonus.

But this is really the wrong place to have such discussions :)

Yeah, I agree. Whatever happens, I think this should probably be a Core issue, and we should switch this discussion over to the ISO mailing list.

Yeah, but if you’re explicit about the constexpr it works fine in both compilers: https://godbolt.org/z/xdad8v. Here’s a summary of frontend divergence on this issue: https://godbolt.org/z/KfdKcT. I think basically this is one of those areas where implementation divergence is so significant that no one can write portable code dependent on it, so we should consider simplifying. I understand the desire to make []{ return x; }, [=]{ return x; }, and [x]{ return x; } all do the same thing (though that doesn’t save us from the fact that [x=x]{ return x; } does something different), but the reality is that anyone who is writing code that’s dependent on this behavior is either only using Clang or—worse—depending on the wrong behavior and avoiding Clang. And that doesn’t even always work; consider this example: https://godbolt.org/z/ETMGEq where adding a default capture causes Clang to treat the reference to a global as a reference to const.

That’s a really interesting example. Yes, we’re certainly mishandling that per the language rules.

Yeah, I agree. Whatever happens, I think this should probably be a Core issue, and we should switch this discussion over to the ISO mailing list.

Sounds reasonable to me.