Why does IRTranslator align the size for dynamic alloca?

I am interested in how IRTranslator and LegalizerHelper handle alloca, specifically when

  • alloca’s size is dynamic, and
  • alloca’s alignment is stricter than what the target requires for its stack

When this is true, translateAlloca will round the size of the alloca up to the target’s stack alignment here:


bool IRTranslator::translateAlloca(const User &U,
                                   MachineIRBuilder &MIRBuilder) {
...
  // Round the size of the allocation up to the stack alignment size
  // by add SA-1 to the size. This doesn't overflow because we're computing
  // an address inside an alloca.
  unsigned StackAlign =
      MF->getSubtarget().getFrameLowering()->getStackAlignment();
  auto SAMinusOne = MIRBuilder.buildConstant(IntPtrTy, StackAlign - 1);
  auto AllocAdd = MIRBuilder.buildAdd(IntPtrTy, AllocSize, SAMinusOne,
                                      MachineInstr::NoUWrap);
  auto AlignCst =
      MIRBuilder.buildConstant(IntPtrTy, ~(uint64_t)(StackAlign - 1));
  auto AlignedAlloc = MIRBuilder.buildAnd(IntPtrTy, AllocAdd, AlignCst);

  Align Alignment = std::max(AI.getAlign(), DL->getPrefTypeAlign(Ty));
  if (Alignment <= StackAlign)
    Alignment = Align(1);
  MIRBuilder.buildDynStackAlloc(getOrCreateVReg(AI), AlignedAlloc, Alignment);
...
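
For readers less familiar with the add-then-mask idiom, the size rounding above can be sketched outside of LLVM as plain integer arithmetic. This is only an illustration: `roundUpToStackAlign` is a hypothetical helper, not an LLVM API, and it assumes the alignment is a nonzero power of two and that the addition does not wrap.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of LLVM): rounds Size up to the next
// multiple of StackAlign with the same add-then-mask sequence as the
// excerpt above. Assumes StackAlign is a nonzero power of two and that
// Size + (StackAlign - 1) does not wrap around.
inline uint64_t roundUpToStackAlign(uint64_t Size, uint64_t StackAlign) {
  assert(StackAlign != 0 && (StackAlign & (StackAlign - 1)) == 0 &&
         "alignment must be a power of two");
  uint64_t SAMinusOne = StackAlign - 1;  // e.g. 15 for a 16-byte alignment
  uint64_t AllocAdd = Size + SAMinusOne; // bump past the next boundary
  return AllocAdd & ~SAMinusOne;         // clear the low bits
}
```

With a 16-byte stack alignment, a 13-byte request becomes 16 bytes and a 17-byte request becomes 32 bytes.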

Then, LegalizerHelper aligns the pointer result of DYN_STACKALLOC here:

Register LegalizerHelper::getDynStackAllocTargetPtr(Register SPReg,
                                                    Register AllocSize,
                                                    Align Alignment,
                                                    LLT PtrTy) {
  ...
  if (Alignment > Align(1)) {
    APInt AlignMask(IntPtrTy.getSizeInBits(), Alignment.value(), true);
    AlignMask.negate();
    auto AlignCst = MIRBuilder.buildConstant(IntPtrTy, AlignMask);
    Alloc = MIRBuilder.buildAnd(IntPtrTy, Alloc, AlignCst);
  }

  return MIRBuilder.buildCast(PtrTy, Alloc).getReg(0);
}

In my case Alignment > Align(1) is true because the alignment coming from alloca is stricter than StackAlign. Doesn’t aligning the size of the alloca in IRTranslator become redundant if the resulting pointer gets aligned here in the legalizer either way? If so, I think the codegen improves if we avoid padding the size of the alloca in translateAlloca when it has alignment stricter than the stack alignment.

Otherwise, is there a reason why the size of alloca must be aligned, and not just the pointer result?
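
To make the redundancy argument concrete, here is a small model of the arithmetic, not the actual LLVM lowering. It assumes a downward-growing stack, a stack pointer that is already a multiple of the stack alignment, power-of-two alignments, and no wraparound. All function names are hypothetical. Under those assumptions, masking the pointer with the stricter alignment gives the same address whether or not the size was padded first.

```cpp
#include <cassert>
#include <cstdint>

// All names below are hypothetical; this models the arithmetic only.
inline bool isPowerOfTwo(uint64_t V) { return V != 0 && (V & (V - 1)) == 0; }

// Size padding as done in translateAlloca: round up to a multiple of A.
inline uint64_t roundUpTo(uint64_t Size, uint64_t A) {
  assert(isPowerOfTwo(A));
  return (Size + (A - 1)) & ~(A - 1);
}

// Pointer computation modeled on getDynStackAllocTargetPtr for a
// downward-growing stack: subtract the size, then mask with -Alignment.
inline uint64_t allocTargetPtr(uint64_t SP, uint64_t Size,
                               uint64_t Alignment) {
  assert(isPowerOfTwo(Alignment));
  return (SP - Size) & ~(Alignment - 1);
}

// Returns true if, for every size in [0, MaxSize], padding the size to
// StackAlign first does not change the resulting pointer.
inline bool paddingIsRedundant(uint64_t SP, uint64_t StackAlign,
                               uint64_t Alignment, uint64_t MaxSize) {
  assert(isPowerOfTwo(StackAlign) && isPowerOfTwo(Alignment));
  assert(Alignment > StackAlign && SP % StackAlign == 0 && MaxSize < SP);
  for (uint64_t Size = 0; Size <= MaxSize; ++Size) {
    uint64_t Padded =
        allocTargetPtr(SP, roundUpTo(Size, StackAlign), Alignment);
    uint64_t Unpadded = allocTargetPtr(SP, Size, Alignment);
    if (Padded != Unpadded)
      return false;
  }
  return true;
}
```

For instance, with SP = 0x10000, a 16-byte stack alignment, and a 64-byte alloca alignment, a 13-byte request yields 0xFFC0 both with and without padding. The result is also a multiple of 64, so it stays 16-byte aligned.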

I traced this back to the implementation reviewed in D66678.

Thanks.

I think this was copied from what SelectionDAGBuilder does. I don't think the IRTranslator should be doing this alignment; it should faithfully preserve whatever was in the original IR.

I see that SelectionDAGBuilder has the same behavior as IRTranslator, as you say. Here is a link to the original change: "If a dynamic_stackalloc alignment requirement is <= stack alignment…" (llvm/llvm-project@95667c5 on GitHub).

Should I proceed with making a patch that skips the size alignment in these two locations when the alignment exceeds StackAlign?

I believe removing the size alignment completely would require later lowering to enforce StackAlign (e.g., in LegalizerHelper).

Yes, that would be good.

Here is the PR: "[CodeGen] Avoid aligning alloca size." by jcogan-nv (Pull Request #132064, llvm/llvm-project on GitHub).