AVX spill alignment

Hey guys,

Are spills/reloads of AVX registers using aligned stores/loads? I can’t
seem to find the code that aligns the stack slots to 32 bytes. Could
someone point me in the right direction?

Thanks,
Cameron

> Are spills/reloads of AVX registers using aligned stores/loads?

Yes.

> I can't seem to find the code that aligns the stack slots to 32 bytes.
> Could someone point me in the right direction?

The register class has 256-bit (32-byte) spill alignment, given by the third operand:

def VR256 : RegisterClass<"X86", [v32i8, v16i16, v8i32, v4i64, v8f32, v4f64],
                          256, (sequence "YMM%u", 0, 15)> {
  let SubRegClasses = [(FR32 sub_ss), (FR64 sub_sd), (VR128 sub_xmm)];
}

/jakob

Ah, thanks. That seems easy enough.

Sorry to be pedantic, but does that snippet also handle cases where the frame pointer, %rbp, needs to be 32-byte aligned when dynamic allocas are present?

I’ve looked at the ABI, but I don’t see any guarantees about 32-byte frame alignment for AVX. That can be trouble when spill slots are based off of the frame pointer, not the stack pointer. Please correct me if I’m wrong.
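
To make the concern concrete, here is a minimal C sketch (not LLVM code). The helper names `is_aligned` and `slot_address` and the example addresses are made up for illustration. It shows that a slot at a 32-byte offset from a frame pointer that is only 16-byte aligned is not necessarily 32-byte aligned:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: nonzero if addr is a multiple of align.
   align must be a power of two. */
static int is_aligned(uintptr_t addr, uintptr_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr & (align - 1)) == 0;
}

/* Hypothetical helper: address of a spill slot that sits at a fixed
   negative offset from the frame pointer. */
static uintptr_t slot_address(uintptr_t frame_pointer, uintptr_t offset) {
    return frame_pointer - offset;
}
```

With a frame pointer of 0x7fff0020, `slot_address(fp, 32)` is 32-byte aligned. With 0x7fff0010, which is still 16-byte aligned, the same offset gives a slot that is only 16-byte aligned, so an aligned 256-bit store to it would fault.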

I am working off of a 2.9ish branch, so I would like to cherry-pick the changes I need. If it’s too much trouble to pinpoint the source, please say so.

Tx,
Cameron

If spill slots or other stack variables require higher alignment than the ABI guarantees, dynamic stack realignment code should be inserted in the prolog.

The logic to control this is quite complicated, so you may want to verify that it actually works.
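
As a rough illustration of what realignment amounts to arithmetically, here is a small C sketch. It is not the LLVM implementation, and `realign_down` is a hypothetical name. Since the stack grows downward, the stack pointer is rounded down to the required alignment by masking off the low bits, which for a 32-byte requirement corresponds to something like `and $-32, %rsp` in the prolog:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: round a stack pointer value down to a multiple
   of align. align must be a power of two. */
static uintptr_t realign_down(uintptr_t sp, uintptr_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    return sp & ~(align - 1);
}
```

An already aligned value is left unchanged. Otherwise the result moves down by fewer than `align` bytes, so slots addressed from the realigned pointer at multiples of 32 are themselves 32-byte aligned.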

/jakob