Making lambda function name consistent with GCC?

Hi,

I noticed that when compiling lambda functions, the generated function
names use different conventions than GCC.
Example: https://godbolt.org/z/5qvqKqEe6
The lambda in Clang is named "_Z3barIZ3foovE3$_0EvT_", while the one
in GCC is named "_Z3barIZ3foovEUlvE_EvT_". Their demangled names are
also different ("void bar<foo()::$_0>(foo()::$_0)" vs "void
bar<foo()::{lambda()#1}>(foo()::{lambda()#1})").
Lambdas are not covered by the ABI so this is OK.
However there are use-cases where I find it very inconvenient when
they generate different names. For example, if we are to compare the
performance difference of the same software compiled under Clang and
GCC, the perf stack traces will look very different because of the
naming differences, making it hard to compare.
Is there any particular reason that Clang uses a different naming
convention for lambdas, and would there be pushback if we were to
make it consistent with GCC?
Thanks.
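
For readers who want to reproduce the demangled forms quoted above without opening the Compiler Explorer link, here is a minimal sketch that feeds both symbol names to abi::__cxa_demangle. It assumes a toolchain that ships <cxxabi.h> (GCC's libstdc++ or LLVM's libc++abi). The helper name `demangle` and the two constants are made up for illustration, and the exact rendering of the lambda can differ between demanglers.

```cpp
#include <cxxabi.h>  // abi::__cxa_demangle (libstdc++ / libc++abi)
#include <cstdlib>
#include <string>

// Hypothetical helper: returns the demangled form of `mangled`,
// or an empty string if the demangler rejects the name.
std::string demangle(const char* mangled) {
    int status = 0;
    char* out = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    std::string result = (status == 0 && out != nullptr) ? out : "";
    std::free(out);  // the buffer is malloc-allocated; free(nullptr) is a no-op
    return result;
}

// The two symbol names quoted in the message above.
const char* const kClangName = "_Z3barIZ3foovE3$_0EvT_";
const char* const kGccName   = "_Z3barIZ3foovEUlvE_EvT_";
```

Printing demangle(kClangName) and demangle(kGccName) side by side shows the naming difference that makes the perf traces hard to compare.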

Redirect to cfe-dev.

Lambdas are not covered by the ABI so this is OK.

Actually, they are. See 5.1.8 of the ABI doc (https://github.com/itanium-cxx-abi/cxx-abi)

The reason is that these symbols do escape into object files with external linkage (not something originally anticipated).

It would be good to have clang match the ABI. I am not sure how much pain it would be for users to switch, though -- perhaps it would mean having two manglings, and therefore two distinct instances, in the same executable. Other than code bloat, most would not notice, unless someone put a static variable into their lambda's call operator.

Could you provide a quick example of this ABI break (where two files compiled with the same compiler (GCC or Clang) link/run correctly, but mixing the two compilers in either direction fails to link/run correctly)?

hm, it turned out to be not quite the case I was thinking of:

// header.h
template<typename T> int bar (T) {static int i; return i++; }

// the following lambda is attached to 'ctr', and therefore the
// same type in every TU, you need gcc >= 10 to get this right.
// not sure if clang models that correctly (given the template
// mangling bug). And yes, this idiom exists in header files
// -- I'm looking at you, ranges library :-)
auto ctr = [](){};

// TUa.cc
#include "header.h"

// these instantiations are _Z3barIN3ctrMUlvE_EEiT_ due to the
// above-mentioned attachment
int maker1 () { return bar (ctr); }

// TUb.cc
#include "header.h"
int maker2 () { return bar (ctr); }

Those maker[12] functions are calling the exact same bar instantiation, so should see consistent numbering from the static variable.

hope that helps.
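
The point about the shared static can be sketched in a single file. This is an illustration only, and it assumes `ctr` is declared `inline` (C++17) so that its closure type is the same entity wherever the header is included; in a real test, maker1 and maker2 would live in TUa.cc and TUb.cc.

```cpp
// Assumption: C++17 `inline` gives `ctr` (and its closure type) a single
// definition across every translation unit that includes this code.
inline auto ctr = []() {};

// Each instantiation of bar owns exactly one function-local static `i`,
// which is zero-initialized before the first call.
template <typename T>
int bar(T) {
    static int i;
    return i++;
}

// Both callers name the same closure type, so both use the same bar<...>.
int maker1() { return bar(ctr); }
int maker2() { return bar(ctr); }
```

Because both functions call the same bar instantiation, successive calls see one counter: calling maker1() and then maker2() yields 0 and then 1.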

Can you file a bug for this?

-Tom

Filed https://bugs.llvm.org/show_bug.cgi?id=50209.
Thanks.

Hi,

The layout of lambdas is different between GCC and clang. This already results in wrong code when some object files are compiled with GCC and others with clang in some corner cases:

test.h:

   extern int i;
   extern int f();

   inline auto lambda = [](){
     int &ref = i;
     return [&]() { return ref; };
   }();

test1.cc:

   #include "test.h"
   int f() { return sizeof lambda; }
   int i;

test2.cc:

   #include "test.h"
   int main() { return f() != sizeof lambda; }

This program should return 0. It does so when test1.cc and test2.cc are both compiled with GCC, and when both are compiled with clang, but it returns 1 when one is compiled with GCC and the other with clang.

In more realistic cases where the different layout would cause problems, the different naming has the benefit (although not by design) that it generally either avoids the problem, by making GCC's and clang's types different, or clearly diagnoses it, by causing a linker error. If we aligned the names, we would get more cases where GCC and clang silently generate incompatible code that is never diagnosed but causes problems at runtime. In my opinion, it would be a bad idea to align mangling between GCC and clang until the types are actually compatible.

Cheers,
Harald van Dijk
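
To check which size a given compiler picks for the closure in the test.h example above, a single-file sketch such as the following can be built once with each compiler and the results compared. `closure_size` is a hypothetical helper added for illustration, and `i` is defined in this file rather than in test1.cc. The sketch needs C++17 for the inline variable, and it makes no claim about which value either compiler reports.

```cpp
#include <cstddef>

int i;  // stands in for the `extern int i` declared in test.h

// Same shape as the lambda in test.h: the inner closure refers to `ref`,
// a local reference bound to an object with static storage duration.
inline auto lambda = []() {
    int& ref = i;
    return [&]() { return ref; };
}();

// Hypothetical helper: the closure size as seen by the compiler that
// built this translation unit.
std::size_t closure_size() { return sizeof lambda; }
```

Printing closure_size() from a build with each compiler shows whether they agree on the closure's layout, with no cross-compiler link step needed.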

There is a bug tracking this issue now: https://bugs.llvm.org/show_bug.cgi?id=50209
I think it might be good to move the discussion there so it's easier to track.

-Tom

Those maker[12] functions are calling the exact same bar instantiation,

I don’t think that’s right – this program contains an ODR violation due to redefinition of ‘ctr’ (with two different types) in TUa and TUb.

There is no ABI requirement to use a specific mangling for entities that are not externally visible, and Clang takes advantage of this to avoid strictly numbering lambdas that are only visible in a single TU. Though it’s unfortunate that we put a ‘$’ in the name in such cases; that seems to confuse some demanglers that incorrectly think that identifiers can contain only [A-Za-z0-9_], and results in a demangling that doesn’t mention that we are inside a lambda.

Oh, I forgot to make 'ctr' inline -- that would remove that ODR violation, right? (Or am I just confusing trauma from header-units :-) )

Yes, making ‘ctr’ inline would address that. It also causes Clang to mangle the lambda following the ABI rule. Er… except, with that change:

GCC mangles i as _ZZ3barIN3ctrMUlvE_EEiT_E1i
Clang mangles i as _ZZ3barI3ctrMUlvE_EiT_E1i

The ABI document doesn’t say how to mangle this case at all; the relevant mangling rule seems (surprisingly) to be <data-member-prefix>, but that can only appear after a <prefix>, which we don’t have here. So it’s unclear whether this should be mangled as <data-member-prefix> <closure-type-name> or as N <data-member-prefix> <closure-type-name> E.

I suppose we have to pick the latter (and that’s what demanglers expect) because …I3ctrM… could also be …<ctr, (some pointer-to-member type)… – so it looks like that’s a Clang ABI bug and a bug in the ABI document :-) I’ve added a description of the latter to https://github.com/itanium-cxx-abi/cxx-abi/issues/94.