DenseMap/ValueMap: is M[New]=M[Old] valid ?

Hi,

I have a question about llvm/ADT/DenseMap.h and llvm/IR/ValueMap.h:

When you have a:
     MapType M;

is it safe to do:
    M[NewKey] = M[OldKey];

or do you need to do it in two steps:
   auto tmp = M[OldKey]; // ensure the reference to M[OldKey] is copied, before reassigning.
   M[NewKey] = tmp; // might reallocate

aka, will a possible allocation for M[NewKey] invalidate the reference that M[OldKey] returns ?

Greetings,

Jeroen Dobbelaere

Yep - You’d have to separate them for correctness if “NewKey” might not already be in “M”.

I actually ran into an elusive bug with this exact same issue some time ago as well, see https://bugs.llvm.org/show_bug.cgi?id=42065 and https://reviews.llvm.org/D62624.

The strange thing about that bug was that it only showed up if built with GCC; Clang dereferenced the right hand side reference before evaluating the left hand side, as long as the value type was as small as the reference itself.

// Martin

I’d be surprised if Clang or GCC’s behavior here varied depending on the size of anything, but maybe?

In any case, C++17 or so requires the RHS to be evaluated before the LHS for assignments - so this is now /always/ wrong, not just unspecified (which, I guess, also always wrong… just sometimes accidentally right) as it was before.

It's not a question about when the operands are evaluated - both do evaluate them in the same order. But evaluating the left and right hand side leaves you with two references. Then to do the assignment, you dereference the right hand side reference and assign it to the dereferenced left hand side reference.

If the value type is as small as the reference itself, Clang dereferences the right hand side reference before evaluating the left hand side.

The testcase in https://bugs.llvm.org/show_bug.cgi?id=42065 allows you to play around with different variants of this; small or large value type, and different forms of the assignment expression (where one form of the expression helped for Clang but not for GCC).

The core part of the testcase is this:

void copy(void *a, void *b) {
         GlobalMap[a] = GlobalMap[b];
}

When compiled with Clang, ends up like this:

define void @_Z4copyPvS_(i8*, i8*) #1 {
   %3 = alloca i8*, align 8
   %4 = alloca i8*, align 8
   store i8* %0, i8** %3, align 8
   store i8* %1, i8** %4, align 8
   %5 = call dereferenceable(8) i8** @_ZN4llvm12DenseMapBaseINS_8DenseMapIPvS2_NS_12DenseMapInfoIS2_EENS_6detail12DenseMapPairIS2_S2_EEEES2_S2_S4_S7_EixERKS2_(%"class.llvm::DenseMapBase"* getelementptr inbounds (%"class.llvm::DenseMap", %"class.llvm::DenseMap"* @GlobalMap, i32 0, i32 0), i8** dereferenceable(8) %4)
   %6 = load i8*, i8** %5, align 8
   %7 = call dereferenceable(8) i8** @_ZN4llvm12DenseMapBaseINS_8DenseMapIPvS2_NS_12DenseMapInfoIS2_EENS_6detail12DenseMapPairIS2_S2_EEEES2_S2_S4_S7_EixERKS2_(%"class.llvm::DenseMapBase"* getelementptr inbounds (%"class.llvm::DenseMap", %"class.llvm::DenseMap"* @GlobalMap, i32 0, i32 0), i8** dereferenceable(8) %3)
   store i8* %6, i8** %7, align 8
   ret void
}

Here %6 is the value type loaded from the RHS reference %5, loaded before evaluating the LHS.

If the value type of the map is something larger than just a pointer, the emitted IR looks like this instead:

define void @_Z4copyPvS_(i8*, i8*) #1 {
   %3 = alloca i8*, align 8
   %4 = alloca i8*, align 8
   store i8* %0, i8** %3, align 8
   store i8* %1, i8** %4, align 8
   %5 = call dereferenceable(800) %struct.LargeValue* @_ZN4llvm12DenseMapBaseINS_8DenseMapIPv10LargeValueNS_12DenseMapInfoIS2_EENS_6detail12DenseMapPairIS2_S3_EEEES2_S3_S5_S8_EixERKS2_(%"class.llvm::DenseMapBase"* getelementptr inbounds (%"class.llvm::DenseMap", %"class.llvm::DenseMap"* @GlobalMap, i32 0, i32 0), i8** dereferenceable(8) %4)
   %6 = call dereferenceable(800) %struct.LargeValue* @_ZN4llvm12DenseMapBaseINS_8DenseMapIPv10LargeValueNS_12DenseMapInfoIS2_EENS_6detail12DenseMapPairIS2_S3_EEEES2_S3_S5_S8_EixERKS2_(%"class.llvm::DenseMapBase"* getelementptr inbounds (%"class.llvm::DenseMap", %"class.llvm::DenseMap"* @GlobalMap, i32 0, i32 0), i8** dereferenceable(8) %3)
   %7 = bitcast %struct.LargeValue* %6 to i8*
   %8 = bitcast %struct.LargeValue* %5 to i8*
   call void @llvm.memcpy.p0i8.p0i8.i64(i8* %7, i8* %8, i64 800, i32 8, i1 false)
   ret void
}

In this case, the RHS reference isn't dereferenced until both references are available, and the value can be copied with a memcpy.

// Martin

From: Martin Storsjö <martin@martin.st>

[...]

It's not a question about when the operands are evaluated - both do
evaluate them in the same order. But evaluating the left and right hand
side leaves you with two references. Then to do the assignment, you
dereference the right hand side reference and assign it to the
dereferenced left hand side reference.

[...]

         GlobalMap[a] = GlobalMap[b];

[...]

Is there a checker in clang that can detect this kind of wrong usage ?
If not, how hard would it be to write one ?

Greetings,

Jeroen Dobbelaere

From: llvm-dev [mailto:llvm-dev-bounces@lists.llvm.org] On Behalf Of
Jeroen Dobbelaere via llvm-dev
Sent: Friday, September 27, 2019 5:46 AM
To: Martin Storsjö; David Blaikie
Cc: llvm-dev@lists.llvm.org
Subject: Re: [llvm-dev] DenseMap/ValueMap: is M[New]=M[Old] valid ?

> From: Martin Storsjö <martin@martin.st>
[...]
> It's not a question about when the operands are evaluated - both do
> evaluate them in the same order. But evaluating the left and right hand
> side leaves you with two references. Then to do the assignment, you
> dereference the right hand side reference and assign it to the
> dereferenced left hand side reference.
[...]
> GlobalMap[a] = GlobalMap[b];
[...]

Is there a checker in clang that can detect this kind of wrong usage ?

I think I've tripped over this construct twice; once when [a] was always
supposed to be there already (but wasn't because of a bug elsewhere), and
another time when [a] was never supposed to be there already (which has
the obvious fix). I doubt a checker would have enough context to tell
those apart, but maybe it could tell you to add an assertion (that [a] is
already present) or break up the assignment.
--paulr

huh, that’s super surprising to me. I’d have thought the unoptimized code from the frontend would make it pretty difficult for the middle-end to reorder any of the loads, stores, etc.