Non-merged type info name comparison

In https://reviews.llvm.org/rL361913, libc++ gained the option to make type info comparisons use a strcmp on the type info name, which is useful for when we can’t guarantee RTTI uniqueness. I believe libstdc++ also defaults to strcmp comparisons. However, libstdc++ doesn’t perform the string comparison when the type info name begins with the * character [1], which is the case for e.g. types defined in anonymous namespaces. Should libc++’s implementation be changed to match, at least when targeting Linux?

[1] https://github.com/gcc-mirror/gcc/blob/277b02e227df91c686e7f7ad1ae21cd898611ca8/libstdc%2B%2B-v3/libsupc%2B%2B/typeinfo#L123

Please don’t post links/snippets of GPL code here; we can’t look at them for licensing reasons.

– Marshall

Ah, sorry – my apologies.

From: Marshall Clow <mclow.lists@gmail.com>
Date: Monday, September 23, 2019 at 12:14 PM
To: Shoaib Meenai <smeenai@fb.com>
Cc: "libcxx-dev@lists.llvm.org" <libcxx-dev@lists.llvm.org>
Subject: Re: [libcxx-dev] Non-merged type info name comparison

In rG2405bd689815, libc++ gained the option to make type info comparisons use a strcmp on the type info name, which is useful for when we can’t guarantee RTTI uniqueness. I believe libstdc++ also defaults to strcmp comparisons. However, libstdc++ doesn’t perform the string comparison when the type info name begins with the * character [1], which is the case for e.g. types defined in anonymous namespaces. Should libc++’s implementation be changed to match, at least when targeting Linux?

Actually, in what cases can we guarantee that RTTI has been fully duplicated? For example, I don't think anything prevents users from using a shared library that redefines the RTTI for a type, no?

I've had a couple of complaints of people saying we should do the string comparison -- I wonder whether we might have put performance above correctness in this case. My reasoning is that if we don't have a valid use case under which RTTI is guaranteed to be unique, we should do the string comparison. And if we do have such a use case, we should evaluate how prominent it is to determine whether we do the slow-but-correct approach by default, or the fast-but-sometimes-incorrect approach by default.

Louis
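For readers following the trade-off described above, here is a minimal sketch of the two strategies. The helper names are hypothetical and are not libc++'s actual functions; the point is only that pointer comparison is cheap but relies on each type having exactly one copy of its name, while the string fallback tolerates several copies.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helpers for illustration; not libc++'s actual functions.

// Fast: correct only if every type has exactly one RTTI name in the process.
inline bool equal_by_pointer(const char* lhs, const char* rhs) {
    return lhs == rhs;
}

// Slower: also matches when two distinct copies of the same name exist.
inline bool equal_by_name(const char* lhs, const char* rhs) {
    assert(lhs != nullptr && rhs != nullptr);
    return lhs == rhs || std::strcmp(lhs, rhs) == 0;
}
```

If two separate arrays both hold the name "3Foo" (standing in for an executable and a shared library that each carry their own copy of the RTTI for `Foo`), `equal_by_pointer` reports them as different while `equal_by_name` reports them as the same type.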

Do you mean deduplicated?

If so, then in binaries that have everything (including libc++) statically linked, this is guaranteed.

Yes, sorry!

Right. So some platforms (I guess Android) have this property, and other platforms (like Apple platforms, where we don’t link libc++ statically into applications) do not. I guess it’s then a choice to be made by vendors, and the status quo is OK.

Louis

Revisiting this, I think it would make sense to do that. The three modes would become:

1) Unique: Assume RTTI is de-duplicated, and only ever compare pointers -- AS-IS TODAY
2) NonUnique: Assume RTTI is not de-duplicated: compare pointers first, fall back to string comparison, but not when the name starts with * -- THIS IS DIFFERENT FROM TODAY
3) NonUniqueARMRTTIBit: Weird implementation for ARM64 -- DOESN'T CHANGE
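The first two modes listed above could be sketched roughly as follows. This is an illustration under stated assumptions, not libc++'s implementation: `RttiMode` and `names_equal` are hypothetical names, and the ARM64 variant is left out because its details are not described in this thread.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical names for illustration; not libc++'s actual internals.
enum class RttiMode { Unique, NonUnique };

// Compares two type info name pointers under the chosen mode.
inline bool names_equal(const char* lhs, const char* rhs, RttiMode mode) {
    assert(lhs != nullptr && rhs != nullptr);
    if (lhs == rhs) return true;                 // same copy of the name
    if (mode == RttiMode::Unique) return false;  // pointers are authoritative
    // Proposed rule: a leading '*' opts the name out of string comparison.
    if (lhs[0] == '*' || rhs[0] == '*') return false;
    return std::strcmp(lhs, rhs) == 0;
}
```

Under this sketch, two distinct copies of a name such as "*N12_GLOBAL__N_13FooE" (an assumed example of an anonymous-namespace type name carrying the leading *) compare unequal in NonUnique mode, while two copies of "3Foo" still compare equal.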

Eric, what do you think? It would be straightforward to implement, and it would unbreak the NonUnique variant in the cases where we have identically named types in anonymous namespaces -- an arguable bug in the NonUnique implementation today.

Louis

One thing I’ve since discovered is that clang doesn’t produce the leading * for anonymous namespace types either, so that would be the other half of such a change. See https://llvm.org/PR34907