Missing dereferencable flag on loads that previously had them

Hi,

We’ve noticed some regressions as a result of removing MODereferencable from loads in visitLoad from SelectionDAGBuilder.cpp. The problematic code is this condition from Loads.cpp:

    if (!GEP->accumulateConstantOffset(DL, Offset) || Offset.isNegative() ||
        !Offset.urem(APInt(Offset.getBitWidth(), Alignment.value()))
             .isMinValue())

What we’re seeing is that our loads fail to get the dereferencable flag for 2 reasons:

  • The Loop Strength Reduction pass is moving the pointer forward, causing the offset to look negative, even though it’s not. Notice that -15 is added to the pointer from constant + 15 (I’ve added ** in the code for emphasis)
  %uglygep21 = getelementptr i8, i8* bitcast (i16* getelementptr inbounds ([78 x [16 x i16]], [78 x [16 x i16]]* @data_index, i32 0, i32 0, i32 **15**) to i8*), i32 %lsr.iv ; uid:1984
  %uglygep2122 = bitcast i8* %uglygep21 to i16* ; uid:1985
  %scevgep = getelementptr i16, i16* %uglygep2122, i32 **-15** ; uid:1986
  • The load comes from PHI, so we don’t see that the pointer is coming from a constant.

In clang15 (or prior to bb70b5d40652207c0bd3d385def10ef3ef1d45b4), the dereferencable flag was applied without calling isDereferenceableAndAlignedPointer. This allowed us to use vectorization on the loads.

This is an example of the code we want to be able to select into dereferencable loads: https://godbolt.org/z/qde9Meeqs

Does anyone know how to address this?

@arsenm - the commits that changed setting MODereferenceable this are on your name, are you aware of this issue?

Best regards,

Betzalel

Nope, not aware. I don’t understand the negative offset check, it was introduced here

The old low bit masking code makes more sense to me. If you restore that does it fix your issue?

Hi,

Thanks for your help.

I tried to restore the old code, and I’m still having issues. The difference is that the old code uses this calculation:

(Offset + LoadSize).ule(DL.getTypeAllocSize(BaseType)) 

When the Offset is negative, this calculation will always return false. This is probably what made it simpler to just calculate if offset is negative.

The problem is that the offset we calculate in the inner vector can’t be saved and used when we get to the getelementptr on the outside (at %scevgep). At %uglygep21, BaseType has a getTypeAllocSize of 1, and at %scevgep it’s 2, and the -15 will be considered outside of the allocated area even if it were positive.

Basically, it seems like we need a way of transferring both the inner base type and the total accumulated offset so we can use them in this calculation.

Take a step back; what information do we actually need to prove it’s dereferenceable? We need to prove the possible values of the total offset are in range of the variable. Given a GEP with a variable offset, we need to compute possible values of that variable. And in non-trivial cases, that basically requires SCEV, so if we want that information, we probably need to record it before we get to SelectionDAG.

It’s not clear to me why you need “dereferenceable” in this context. It’s usually only necessary to speculate loads, but the given example doesn’t have any conditional control flow.