The RTTI names that Clang generates for classes in anonymous namespaces do not begin with the asterisk (*) prefix. As a result, on GNU/Linux with libstdc++, type_info::operator== treats distinct types as equal where GCC-compiled code does not.
Is this divergence between Clang and GCC accepted behavior?
The problem is described in the issues listed below:
opened 03:14PM - 02 Jan 23 UTC
clang:frontend
```c++
#include <typeinfo>
namespace {
struct my_x {
static int member;
};
int my_x::member;
}
int* get_member() { return &my_x::member; }
const std::type_info* inf() { return &typeid(my_x); }
```
```c++
#include <typeinfo>
namespace {
struct my_x {
static int member;
};
int my_x::member;
}
int* get_member();
const std::type_info* inf();
int* get_member2() { return &my_x::member; }
const std::type_info* inf2() { return &typeid(my_x); }
#include <stdio.h>
int main()
{
printf("%p %p\n", get_member(), get_member2());
printf("%p %p\n", inf(), inf2());
printf("%d\n", get_member() == get_member2());
printf("%d\n", *inf() == *inf2());
}
```
https://godbolt.org/z/Pjnz6jbGb
This prints four pointers, then 0 1.
I believe that is an illegal output. Either the two my_x are the same, and it should print 1 1, or they're different, and it should print 0 0. I think 0 0 is the correct answer, but I'm not very familiar with this corner of the C++ specification.
Similar to #10492 - that one's long gone, but maybe a similar fix would apply here. (GCC has a similar bug, but only with -fmerge-all-constants. edit: And that one's documented as giving noncompliant behavior, so I'm not sure if it is a bug.)
opened 09:10PM - 10 Oct 17 UTC
clang:codegen
bugzilla
| | |
| --- | --- |
| Bugzilla Link | [34907](https://llvm.org/bz34907) |
| Version | 5.0 |
| OS | Linux |
| CC | @apolukhin,@k15tfu,@zygoloid,@rjmccall,@smeenai |
## Extended Description
libstdc++ apparently has a convention where the typeinfo name for a class declared in an anonymous namespace begins with an asterisk ('*'), which tells std::type_info::operator== to consider two type_info objects unequal even if their names are equal. Clang is not outputting this asterisk on GNU/Linux. Because it's omitted, if I declare two classes with the same name, in two different anonymous namespaces, the two class types are considered equal according to std::type_info::operator==, and I can cast from one type to another with dynamic_cast. G++ outputs the asterisk, so the types are treated as unequal.
The asterisk is stripped off in GNU's std::type_info::name(), so it's not user visible.
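The convention described above can be sketched roughly as follows. This is a simplified stand-in rather than libstdc++'s actual source: `sketch_type_info` and `raw_name` are hypothetical names, and the real implementation also depends on configuration macros such as `__GXX_TYPEINFO_EQUALITY_INLINE` and `__GXX_MERGED_TYPEINFO_NAMES`.

```c++
#include <cstring>

// Hypothetical stand-in for std::type_info; only the name handling is modeled.
struct sketch_type_info {
    const char* raw_name;  // mangled name as emitted by the compiler

    // The leading '*' is an internal marker, so it is hidden from callers.
    const char* name() const {
        return raw_name[0] == '*' ? raw_name + 1 : raw_name;
    }

    // Same pointer: always equal. Different pointers: fall back to a string
    // comparison only when the name is NOT marked with '*'.
    bool operator==(const sketch_type_info& rhs) const {
        return raw_name == rhs.raw_name ||
               (raw_name[0] != '*' &&
                std::strcmp(raw_name, rhs.raw_name) == 0);
    }
};
```

Under this sketch, two separately stored copies of "*N12_GLOBAL__N_11AE" compare unequal, while two separately stored copies of "N12_GLOBAL__N_11AE" (the unmarked form) compare equal, which mirrors the G++ and Clang results in the test case below.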
AFAICT, libc++ doesn't have this convention, but for ARM64 iOS, there is a different convention of setting the highest(?) bit of the type_info's __type_name pointer to indicate that string comparison *should* be performed. (Look for the _LIBCPP_HAS_NONUNIQUE_TYPEINFO and _LIBCPP_NONUNIQUE_RTTI_BIT flags in libc++. I wonder if ARM64 iOS also sets _LIBCXX_DYNAMIC_FALLBACK for libc++abi?)
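For contrast, the tagged-pointer idea mentioned above might look something like the sketch below. Everything here is an assumption for illustration: the helper names are hypothetical, the choice of the highest bit follows the guess in the paragraph above, and the code is not taken from libc++.

```c++
#include <cstdint>
#include <cstring>

// Assumed marker: the top bit of the pointer-sized value flags a name that
// may exist in several copies and therefore needs a string comparison.
constexpr std::uintptr_t kNonUniqueBit =
    std::uintptr_t(1) << (sizeof(std::uintptr_t) * 8 - 1);

// Hypothetical helpers that build and inspect the tagged value.
inline std::uintptr_t tag_unique(const char* s) {
    return reinterpret_cast<std::uintptr_t>(s);
}
inline std::uintptr_t tag_non_unique(const char* s) {
    return reinterpret_cast<std::uintptr_t>(s) | kNonUniqueBit;
}
inline bool is_unique(std::uintptr_t t) {
    return (t & kNonUniqueBit) == 0;
}
inline const char* to_string(std::uintptr_t t) {
    return reinterpret_cast<const char*>(t & ~kNonUniqueBit);
}

// Identical tagged values are equal. Otherwise, if either side is unique,
// the types differ; only two non-unique names are compared as strings.
inline bool names_equal(std::uintptr_t lhs, std::uintptr_t rhs) {
    if (lhs == rhs) return true;
    if (is_unique(lhs) || is_unique(rhs)) return false;
    return std::strcmp(to_string(lhs), to_string(rhs)) == 0;
}
```

Note the inverted polarity relative to the libstdc++ asterisk: here the marker opts a name in to string comparison, whereas the asterisk opts a name out of it.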
I'm wondering whether there's a compatibility concern here w.r.t. previous versions of Clang. My first guess is that compatibility with G++/libstdc++/libsupc++ (and correctness) is sufficient to motivate changing Clang. I guess Clang would have to generate different code for -stdlib=libstdc++ and -stdlib=libc++?
Test case:
test.h
```c++
#include <typeinfo>
#include <stddef.h>
#include <stdio.h>

struct Base {
    virtual ~Base() {}
};

namespace def {
    Base *alloc();
    const std::type_info &type();
}
```
test-def.cc
```c++
#include "test.h"

namespace {
    struct A : Base {};
}

namespace def {
    Base *alloc() {
        return new A;
    }
    const std::type_info &type() {
        return typeid(A);
    }
}
```
test-run.cc
```c++
#include "test.h"

namespace {
    struct A : Base {
        void func() {
            printf("ERROR: run func called, field=%d\n", field);
        }
    private:
        int field = 42;
    };
}

__attribute__((noinline))
static A *do_cast(Base *b) {
    return dynamic_cast<A*>(b);
}

__attribute__((noinline))
static bool types_eq(const std::type_info &x, const std::type_info &y) {
    return x == y;
}

int main() {
    printf("def A == run A: %d\n", types_eq(def::type(), typeid(A)));
    printf("&def A == &run A: %d\n", &def::type() == &typeid(A));
    printf("name of def A: %s\n", def::type().name());
    printf("name of run A: %s\n", typeid(A).name());
    printf("def A name == run A name: %d\n", def::type().name() == typeid(A).name());
    Base *b = def::alloc();
    auto *p = do_cast(b);
    if (p == nullptr) {
        printf("SUCCESS: dynamic_cast returned nullptr\n");
    } else {
        p->func();
    }
#ifdef __GXX_TYPEINFO_EQUALITY_INLINE
    printf("__GXX_TYPEINFO_EQUALITY_INLINE = %d\n", __GXX_TYPEINFO_EQUALITY_INLINE);
#endif
#ifdef __GXX_MERGED_TYPEINFO_NAMES
    printf("__GXX_MERGED_TYPEINFO_NAMES = %d\n", __GXX_MERGED_TYPEINFO_NAMES);
#endif
}
```
```
$ cat /etc/issue
Ubuntu 14.04.5 LTS \n \l

$ uname -m
x86_64

$ g++ test-def.cc test-run.cc -std=c++11 && ./a.out
def A == run A: 0
&def A == &run A: 0
name of def A: N12_GLOBAL__N_11AE
name of run A: N12_GLOBAL__N_11AE
def A name == run A name: 0
SUCCESS: dynamic_cast returned nullptr
__GXX_TYPEINFO_EQUALITY_INLINE = 1
__GXX_MERGED_TYPEINFO_NAMES = 0

$ ~/clang+llvm-5.0.0-linux-x86_64-ubuntu14.04/bin/clang++ test-def.cc test-run.cc -std=c++11 && ./a.out
def A == run A: 1
&def A == &run A: 0
name of def A: N12_GLOBAL__N_11AE
name of run A: N12_GLOBAL__N_11AE
def A name == run A name: 0
ERROR: run func called, field=0
__GXX_TYPEINFO_EQUALITY_INLINE = 1
__GXX_MERGED_TYPEINFO_NAMES = 0

$ g++ test-def.cc -S && cat test-def.s
...
_ZTSN12_GLOBAL__N_11AE:
.string "*N12_GLOBAL__N_11AE"
...

$ ~/clang+llvm-5.0.0-linux-x86_64-ubuntu14.04/bin/clang++ test-def.cc -S && cat test-def.s
...
_ZTSN12_GLOBAL__N_11AE:
.asciz "N12_GLOBAL__N_11AE"
...
```
Per issue #34255 (IA64 ABI: RTTI name for class in anonymous namespace lacks '*', breaks dynamic_cast and type_info::operator== on GNU/Linux), I'm pretty sure this is an unintended divergence from gcc/libstdc++ behavior. The open concern has been what to do about backward compatibility.
I agree. I think John's comment on issue #34255 (IA64 ABI: RTTI name for class in anonymous namespace lacks '*', breaks dynamic_cast and type_info::operator== on GNU/Linux) captures it nicely.
Naively, I think we should match the same ABI (per target) by default and then use an ABI tag to allow users to get back to the old ABI if they need to for some reason.