Hi list,
I found that AMDGPUPromoteAlloca calculates the new pointer as follows:
std::tie(TCntY, TCntZ) = getLocalSizeYZ(Builder);
Value *TIdX = getWorkitemID(Builder, 0);
Value *TIdY = getWorkitemID(Builder, 1);
Value *TIdZ = getWorkitemID(Builder, 2);
Value *Tmp0 = Builder.CreateMul(TCntY, TCntZ, "", true, true);
Tmp0 = Builder.CreateMul(Tmp0, TIdX);
Value *Tmp1 = Builder.CreateMul(TIdY, TCntZ, "", true, true);
Value *TID = Builder.CreateAdd(Tmp0, Tmp1);
TID = Builder.CreateAdd(TID, TIdZ);
Does this assume that all three dimensions are already enabled?
That is not actually the case for SI: by default, SI enables only the X dimension
for non-shader programs (see SIMachineFunctionInfo.cpp). Doesn't this conflict?
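
To make the concern concrete, here is a minimal standalone sketch of the arithmetic above in plain C++. The function name linearWorkitemID and the use of uint32_t are my own choices for illustration and are not part of LLVM. Arithmetically, the result reduces to TIdX only if the Y and Z workitem IDs read as 0 and the Y and Z local sizes read as 1; whether that holds when those dimensions are not enabled is what I am unsure about.

```cpp
#include <cstdint>

// Hypothetical helper, not LLVM code: mirrors the index the pass builds,
//   TID = TCntY * TCntZ * TIdX + TIdY * TCntZ + TIdZ
uint32_t linearWorkitemID(uint32_t TIdX, uint32_t TIdY, uint32_t TIdZ,
                          uint32_t TCntY, uint32_t TCntZ) {
  uint32_t Tmp0 = TCntY * TCntZ * TIdX; // offset contributed by the X id
  uint32_t Tmp1 = TIdY * TCntZ;         // offset contributed by the Y id
  return Tmp0 + Tmp1 + TIdZ;            // Z id is the innermost index
}
```

If the Y/Z inputs hold anything other than those values, different workitems could end up with the same index.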
Thanks,
--lx