Loads/Stores and MachineMemOperand

I want to get some clarification on the exact semantics of the MachineMemOperand attached to memory-touching instructions. From what I understand, a MemSDNode has an associated MachineMemOperand and a MachineInstr can have zero or more attached MachineMemOperands.

But what is the guarantee/constraint placed on optimization/codegen passes for maintaining the contents of a MachineMemOperand? In particular, a MachineMemOperand has a Value associated with it for the original LLVM IR pointer, but is there any guarantee that this will be valid for all memory-touching instructions after isel and post-isel optimization? I found the following code in StackColoring that seems to indicate that one should not rely on the Value* in a MachineMemOperand to get at pointer information like address space during instruction printing since it may be NULL.

    if (!V || !isa<AllocaInst>(V)) {
      // Clear mem operand since we don't know for sure that it doesn't
      // alias a merged alloca.
      MMO->setValue(0);
      continue;
    }

Is this just a deficiency in the optimizer, or is there no guarantee that a MachineMemOperand will retain a valid Value* throughout its lifetime?

Thanks for finding this.

I've been trying to find out why multiple address space support breaks with LLVM 3.2, and this looks like the reason:
some memory operations lose the information about their address space, which is stored in the Value of the MachineMemOperand.

Clearing the value seems to be a very nasty thing to do. What is the purpose of this code?

The code itself makes sense, but I want to know if this breaks any guarantee made about preserving a Value* in the MachineMemOperand. It sounds like we’re having the same issue. We were using the Value* stored in the MachineMemOperand to get address space information during assembly printing. The alternative is carrying around a lot of extra (redundant) information in the SDAG.

If it is legal to clear the Value* instead of replacing it with something else, perhaps we can add the address space as another field to the MachineMemOperand. If the Value* gets cleared, at least that would still be available, and I cannot imagine any transformation that would cause the pointer address space to change.

Assuming it is not guaranteed that Value* will remain non-NULL due to alias analysis, the attached patch caches the pointer address space in MachinePointerInfo so it is still available if Value* is NULL-ed out.

Any issues with this approach?
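To make the proposal concrete, here is a minimal sketch of the caching idea using toy stand-in types. ToyValue and ToyPointerInfo are hypothetical names invented for illustration; they are not the real MachinePointerInfo or Value classes, and this is not the content of the attached patch. The point is only that the address space is copied out when the pointer info is built, so it survives a later clearing of the Value*.

```cpp
#include <cassert>

// Toy stand-in for an IR pointer value (hypothetical, not llvm::Value).
struct ToyValue {
  unsigned AddrSpace; // address space of the IR pointer type
};

// Toy stand-in for pointer info attached to a memory operand.
struct ToyPointerInfo {
  const ToyValue *V;  // may be cleared by later passes
  unsigned AddrSpace; // cached copy, captured at construction time

  // Assumption: a null value maps to address space 0 (the default space).
  explicit ToyPointerInfo(const ToyValue *Val)
      : V(Val), AddrSpace(Val ? Val->AddrSpace : 0) {}

  // Mimics a pass that can no longer vouch for the IR pointer.
  void clearValue() { V = nullptr; }

  // Still answers after clearValue(), because it never dereferences V.
  unsigned getAddrSpace() const { return AddrSpace; }
};
```

With this layout, a printer that calls getAddrSpace() does not need to care whether V is still set.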

0001-Persist-pointer-address-space-in-MachinePointerInfo-.patch (3.79 KB)

The MMOs provide extra, optional information that late optimizers may use to combine or reorder memory operations.

In particular, stripping all MMOs does not break the semantics of the program; it just removes some opportunities for optimization.

A load or store without an MMO should be treated as if it were volatile.

This means you probably can't use MMOs for reliable address space information.
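The "treat it as volatile" rule above can be sketched as a conservative query. The types and the function name below are hypothetical toy stand-ins, not the real MachineInstr interface; the sketch only shows the shape of the check a scheduler-like client would make before reordering a memory operation.

```cpp
#include <cassert>
#include <vector>

// Toy stand-ins (hypothetical, not the LLVM classes).
struct ToyMemOperand {
  bool IsVolatile;
};

struct ToyMemInstr {
  bool MayLoadOrStore;                    // does the instruction touch memory?
  std::vector<ToyMemOperand> MemOperands; // optional extra information
};

// Returns true if a client must not reorder or combine this instruction.
bool mustTreatAsOrdered(const ToyMemInstr &MI) {
  if (!MI.MayLoadOrStore)
    return false; // not a memory operation at all
  if (MI.MemOperands.empty())
    return true; // no information survived: assume the worst
  for (const ToyMemOperand &MMO : MI.MemOperands)
    if (MMO.IsVolatile)
      return true;
  return false;
}
```

Because the empty case returns true, dropping every memory operand only makes clients more conservative; it never makes them wrong.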


Is this documented somewhere?

It seems that it isn't. It ought to go in the header file.


Is there a reason MachineMemOperands are not guaranteed to be persisted in late optimization passes? Is there a use-case where they should be stripped?

That's not really the issue, though. As an intermediate representation, MI should be reasonably self-contained. The MMOs are pointers into an old version of the program being compiled. By the time late passes run, the MI representation has undergone many transformations, including CFG changes and code motion, so the links provided by the MMOs get more and more sketchy as the program is optimized.

Currently, we only use the MMOs for alias analysis during scheduling, but even that can cause problems as you've seen with the stack coloring pass.

If you are attaching specific semantics to address spaces, you should encode it in either opcodes or explicit operands.
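As an illustration of the opcode route, the sketch below picks a distinct load opcode per address space at selection time, so the printer never has to consult a memory operand. The opcode names, the address space numbers (1 and 3), and the mnemonics are all invented for this example and do not describe any particular target.

```cpp
#include <cassert>
#include <string>

// Hypothetical target opcodes, one per address space of interest.
enum ToyOpcode { LD_GENERIC, LD_GLOBAL, LD_SHARED };

// Chosen once during instruction selection, while the address space is
// still reliably known. The numbering here is an assumption.
ToyOpcode selectLoadOpcode(unsigned AddrSpace) {
  switch (AddrSpace) {
  case 1:
    return LD_GLOBAL;
  case 3:
    return LD_SHARED;
  default:
    return LD_GENERIC;
  }
}

// The printer depends only on the opcode, which later passes preserve.
std::string printLoad(ToyOpcode Opc) {
  switch (Opc) {
  case LD_GLOBAL:
    return "ld.global";
  case LD_SHARED:
    return "ld.shared";
  case LD_GENERIC:
    break;
  }
  return "ld";
}
```

The same idea works with an explicit immediate operand holding the address space instead of separate opcodes; either way the information travels with the instruction itself.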


My interpretation: if correct code generation requires the exact address space, it should be directly represented in the instruction (as an opcode or an explicit operand).

It's fine to use the MMO for optimization, such as alias analysis, however important they may be on your target, as long as some conservative fall back exists (e.g. unknown address space aliases with all others).
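A conservative fallback of that kind might look like the sketch below. MaybeAddrSpace and mayAliasByAddrSpace are hypothetical names, and the sketch assumes a target on which distinct known address spaces never overlap, which is not true of every target.

```cpp
#include <cassert>

// Address space that may have been lost along the way (hypothetical type).
struct MaybeAddrSpace {
  bool Known;     // false once the memory operand information is gone
  unsigned Space; // meaningful only when Known is true
};

// Assumption: two different known address spaces are disjoint on this target.
bool mayAliasByAddrSpace(MaybeAddrSpace A, MaybeAddrSpace B) {
  if (!A.Known || !B.Known)
    return true; // unknown aliases with everything
  return A.Space == B.Space;
}
```

Only the both-known, different-space case lets the optimizer conclude "no alias"; every loss of information pushes the answer back toward the safe default.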