all LLVM Instructions that may write to memory -- other than StoreInst?

I need to figure out all LLVM Instructions that may write to memory.

In http://llvm.org/docs/tutorial/OCamlLangImpl7.html, it mentions that
“In LLVM, all memory accesses are explicit with load/store instructions, and it is carefully designed not to have (or need) an “address-of” operator.”

I take this as “StoreInst is the only one that writes to memory”.

However, this doesn’t seem to be enough.

Consider:

int a, b, d;
d = a + b;

The above code is turned into LLVM IR:

  %0 = load i32* @a, align 4
  %1 = load i32* @b, align 4
  %2 = add nsw i32 %1, %0
  store i32 %2, i32* @d, align 4

Is it possible that temps such as %0, %1 and/or %2 will NOT be register-allocated later in the compilation pipeline, and will thus be left in memory?

The above code, when converted back to C level, looks like this:
...
  unsigned int llvm_cbe_tmp__6;
  unsigned int llvm_cbe_tmp__7;
  unsigned int llvm_cbe_tmp__8;
  unsigned int llvm_cbe_tmp__9;

  llvm_cbe_tmp__6 = *(&a);
  llvm_cbe_tmp__7 = *(&b);
  llvm_cbe_tmp__8 = ((unsigned int )(((unsigned int )llvm_cbe_tmp__7) + ((unsigned int )llvm_cbe_tmp__6)));
  *(&d) = llvm_cbe_tmp__8;
  llvm_cbe_tmp__9 =  // printf(((&_OC_str.array[((signed int )0u)])), llvm_cbe_tmp__8);
...

It seems the compiler-generated temps are _actually_ left on the stack, and writes to them are actually writes to stack memory (via load, add, ...).

I am confused here.
Could somebody help to clarify it?

Thank you

Chuck

I need to figure out all LLVM Instructions that may write to memory.

In http://llvm.org/docs/tutorial/OCamlLangImpl7.html, it mentions that
“In LLVM, all memory accesses are explicit with load/store instructions, and it is carefully designed not to have (or need) an “address-of” operator.”

I take this as “StoreInst is the only one that writes to memory”.

There are intrinsic functions which write to memory also, such as memcpy.

However, this doesn’t seem to be enough.

Your observation is correct. Strictly speaking, any instruction can write to memory after code generation because it may access a stack spill slot or a function parameter which the ABI places on the stack.

When the Language Reference Manual talks about writing to memory, it is talking about writing to memory that is visible at the LLVM IR level. The stack frame is invisible at the LLVM IR level. Put another way, “memory” is a set of memory locations which can be explicitly accessed by LLVM load and store instructions and are not in SSA form; it is not all of the memory within the computer.
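To make the distinction above concrete, here is a minimal C sketch of which source-level writes typically show up as IR-visible memory writes. The function names are hypothetical, and the comments describe what clang usually emits, not a guarantee: the exact IR depends on the compiler version and optimization level.

```c
#include <assert.h>

int d;  /* global: IR-visible memory (@d) */

/* Writes a global, so the IR contains an explicit `store` to @d. */
void write_global(int a, int b) {
    d = a + b;
}

/* `t` is a plain local. Unoptimized clang output keeps it in an `alloca`
   and accesses it with load/store; after promotion passes such as mem2reg
   it is typically just an SSA value like %2, with no store at all.
   Whether that value later lands in a register or a spill slot is decided
   by the code generator and is not visible in the IR. */
int ssa_temp(int a, int b) {
    int t = a + b;
    return t;
}

/* Writes through a pointer: an explicit `store` in the IR. */
void write_through(int *p, int v) {
    *p = v;
}

/* The address of `t` is passed to another function, so `t` is a stack
   object (`alloca`) and the write to it is IR-visible, unless inlining
   lets the optimizer promote it after all. */
int stack_object(int a, int b) {
    int t = 0;
    write_through(&t, a + b);
    return t;
}

/* Small self-check for the three cases above. */
void check_examples(void) {
    write_global(1, 2);
    assert(d == 3);
    assert(ssa_temp(1, 2) == 3);
    assert(stack_object(1, 2) == 3);
}
```

Calling `check_examples()` from a `main` exercises all three functions. Compiling the file with `clang -S -emit-llvm` at `-O0` and at `-O1` and comparing the output is one way to see where the `store` instructions appear and disappear.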

If you’re interested in finding instructions that write to RAM (including writes to stack spill slots), it may be better to work on Machine Instructions within the code generator framework.

– John T.

John,

Thanks for the reply.
I agree with your comment that the “memory” the LLVM spec refers to doesn’t include the stack.

Let me push this a bit further:

If I need to work on the high-level IR (not machine dependent, not in the code-gen stage), is it reasonable to assume that
ALL LLVM IR instructions that have a result field have the potential to write to the stack?

E.g.

  <result> = add <ty> <op1>, <op2>          ; yields {ty}:result
  br i1 <cond>, label <iftrue>, label <iffalse>
  br label <dest>          ; Unconditional branch

ADD can (potentially) write to the stack to store its result, while BR will NEVER write to the stack because it doesn’t have a result.

Thank you

Chuck

John,

Thanks for the reply.
I agree with your comment that the “memory” the LLVM spec refers to doesn’t include the stack.

It includes stack objects (memory allocated by the alloca instruction) but not the stack frame (e.g., spill slots).

Let me push this a bit further:

If I need to work on the high-level IR (not machine dependent, not in the code-gen stage), is it reasonable to assume that
ALL LLVM IR instructions that have a result field have the potential to write to the stack?

Strictly speaking, I would go so far as to assume that any LLVM IR instruction can write to the stack frame.

E.g.

  <result> = add <ty> <op1>, <op2>          ; yields {ty}:result
  br i1 <cond>, label <iftrue>, label <iffalse>
  br label <dest>          ; Unconditional branch

ADD can (potentially) write to the stack to store its result, while BR will NEVER write to the stack because it doesn’t have a result.

You might be able to get away with this on some platforms. However, you can’t assume this in general; the LLVM IR makes no guarantees at all about which instructions read and write the stack frame and which do not. The branch could load its argument from the stack frame or from a global value pool. On a VLIW machine, it could be packed into an instruction that also contains a read/write from/to the stack frame. Maybe the processor only supports indirect branch instructions.

Whether you want to count on LLVM IR branches not writing to the stack depends on what hardware architecture you’re using and what you’re doing. If you’re counting memory accesses for a heuristic only on x86, then assuming branches don’t write to memory seems reasonable. If you need an accurate count on all supported platforms, I’d look into analyzing the generated machine code.

– John T.

Hi Chuck,

I need to figure out all LLVM Instructions that may write to memory.

I->mayWriteToMemory()

Ciao, Duncan.
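
As a rough illustration of the kind of IR-level question `mayWriteToMemory()` answers, here is a toy model in C. It is not LLVM's implementation: the opcode set, the `toy_` names and the `callee_only_reads` flag are all hypothetical, and the real method handles more cases (atomic operations and fences, among others). The sketch only captures the cases mentioned in this thread, and it says nothing about spill slots or other stack-frame traffic introduced by the code generator.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical opcode subset, for illustration only. */
enum toy_opcode { TOY_ADD, TOY_BR, TOY_LOAD, TOY_STORE, TOY_VAARG, TOY_CALL };

struct toy_inst {
    enum toy_opcode op;
    bool callee_only_reads;  /* meaningful only for TOY_CALL */
};

/* Returns true if the instruction may write IR-visible memory
   (globals, heap, alloca'd stack objects). */
bool toy_may_write_to_memory(const struct toy_inst *inst) {
    switch (inst->op) {
    case TOY_STORE:
    case TOY_VAARG:          /* va_arg updates the va_list it reads from */
        return true;
    case TOY_CALL:           /* e.g. a memcpy intrinsic or an unknown callee */
        return !inst->callee_only_reads;
    case TOY_ADD:
    case TOY_BR:
    case TOY_LOAD:           /* plain (non-volatile, non-atomic) load */
        return false;
    }
    return true;             /* unknown opcode: answer conservatively */
}

/* Small self-check: a store may write, an add does not. */
void toy_self_check(void) {
    struct toy_inst store = { TOY_STORE, false };
    struct toy_inst add = { TOY_ADD, false };
    assert(toy_may_write_to_memory(&store));
    assert(!toy_may_write_to_memory(&add));
}
```

Note that `TOY_ADD` and `TOY_BR` both answer "no" here, regardless of whether they have a result: the question is about memory the IR can name, not about where the code generator ends up placing values.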

Duncan,

I looked at this function even before starting the discussions.

I think it respects LLVM's principle that only Loads and Stores (plus VAArg, and maybe Call) can access global memory, but it doesn't address accesses to stack frames or stack objects.

Thank you for the hint.

Chuck

Hi Chuck,

I looked at this function even before starting the discussions.

I think it respects LLVM's principle that only Loads and Stores (plus VAArg,
and maybe Call) can access global memory, but it doesn't address accesses
to stack frames or stack objects.

I think you need to look at the code generators, since what you're asking
doesn't make much sense at the IR level. For example pretty much *any*
IR instruction could result in accessing the stack if the code generator
spilled registers to the stack.

Ciao, Duncan.