Assigning constant value without alloca/load/store

Hello,

I am currently trying to translate some custom IR to LLVM IR and came across an issue.
The custom IR has several registers, and I am basically trying to SSA-ify it so that it can easily be translated to LLVM IR.

The problem:

Since my custom IR lets me reassign any register, I have to bind every new expression to a new LLVM Value. But my IR has something like this:

REG A = VAR C + CONST 2
REG A = CONST 12

So my workaround looks like:

; I am returning the registers in an anonymous struct

define { i32, i32, i32 } @test(i32 %var_c) {
  ; Initializing registers
  %reg_a_0 = select i1 true, i32 0, i32 0
  %reg_b_0 = select i1 true, i32 0, i32 0
  %reg_c_0 = select i1 true, i32 0, i32 0

  ; Translated instructions
  %reg_a_1 = add i32 %var_c, 2
  %reg_a_2 = select i1 true, i32 12, i32 0

  ; Prepare return values
  %ret_0 = insertvalue { i32, i32, i32 } undef, i32 %reg_a_2, 0
  %ret_1 = insertvalue { i32, i32, i32 } %ret_0, i32 %reg_b_0, 1
  %ret_2 = insertvalue { i32, i32, i32 } %ret_1, i32 %reg_c_0, 2

  ret { i32, i32, i32 } %ret_2
}

I am basically using “select i1 true, i32 1, i32 0” so that after optimization it effectively becomes:
%val = i32 1

But this looks like a hack to me, and I can’t simply write “%val = i32 1”.
So what’s the proper way to do this without actually using alloca/load/store?

Regards,
Paul

Hi!

I don’t know what “the right way” to do this is, but is there any reason you’re against just using alloca/load/store? That’s what clang emits for most locals/parameters/…, and LLVM tends to do a very good job of SSAifying that where it can. :)

George

You can use trivial bitcasts if you want to avoid load/store/alloca and you want SSA variables for your constants:
  %val = bitcast i32 1 to i32

IMHO, don’t try to tell LLVM about your registers. Instead, for each basic block, keep a mapping from each register to the last Value* assigned to it.

For example, here’s a crude suggestion.

First pass, identify all of the branch destinations from your custom IR, where you will need to create an llvm basic block.

Second pass, translate IR instructions into llvm IR for each block.

  • keep track of the last Value* stored in each register.

  • if you need to load a Value* from a register that hasn’t been assigned yet in this block, create a phi node and insert it at the start of the block.

Third pass, follow branches and link up phi nodes.

Or you could simply generate code as if each register lives on the stack, emitting alloca, load and store, and then run the mem2reg pass to promote everything back to virtual registers for you automatically. That’s probably easier to get right.
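For what it's worth, here is a rough sketch of that stack-slot approach for straight-line code, written as a small Python script that just emits IR text. Everything in it is made up for illustration: the function name `translate_with_allocas`, the `(dest, op, lhs, rhs)` tuple format, and the `%<reg>.slot` naming. It also assumes the `ptr` spelling used by recent LLVM releases; older releases would write `i32*` instead.

```python
# Hypothetical sketch: give every register a stack slot and emit LLVM IR text.
# Operands are register names ("a"), function arguments ("%var_c"), or ints.
def translate_with_allocas(registers, program):
    lines = []
    temp_count = 0

    def fresh():
        # Produce a new temporary name: %t1, %t2, ...
        nonlocal temp_count
        temp_count += 1
        return f"%t{temp_count}"

    def read(operand):
        if isinstance(operand, int):
            return str(operand)        # constants are used directly
        if operand.startswith("%"):
            return operand             # function arguments are already SSA values
        tmp = fresh()                  # registers are read through a load
        lines.append(f"  {tmp} = load i32, ptr %{operand}.slot")
        return tmp

    # All allocas go first, then every register is initialised to 0.
    for reg in registers:
        lines.append(f"  %{reg}.slot = alloca i32")
    for reg in registers:
        lines.append(f"  store i32 0, ptr %{reg}.slot")

    for dest, op, lhs, rhs in program:
        if op == "set":                # REG dest = <operand>
            value = read(lhs)
        else:                          # REG dest = lhs <op> rhs
            left, right = read(lhs), read(rhs)
            value = fresh()
            lines.append(f"  {value} = {op} i32 {left}, {right}")
        lines.append(f"  store i32 {value}, ptr %{dest}.slot")
    return lines


# REG A = VAR C + CONST 2 ; REG A = CONST 12
example = [("a", "add", "%var_c", 2), ("a", "set", 12, None)]
body = translate_with_allocas(["a"], example)
print("\n".join(body))
```

The constant assignment becomes a plain `store i32 12, ...`, so no select or bitcast trick is needed, and running mem2reg over the enclosing function should then remove the slots again.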

This looks much better than using “select i1 true, i32 1, i32 0”.
But since something like that (bitcast constant) is allowed why isn’t it possible to simply do this:
%val = i32 1
?

I want to keep the translation short and simple (my IR doesn’t have control flow, so it’s basically one basic block). That’s why I don’t want to rely on alloca/load/store.

Is there some particular reason that short IR is better than “follow what others do”?

I have written a (reasonably complete) Pascal compiler, and at first I worried about emitting allocas, but I found that mem2reg does a very good job of getting rid of them, so it’s not really a problem [you just need to ensure that all allocas happen at the start of the function, or weird things happen, from what I’ve heard].

> But since something like that (bitcast constant) is allowed why isn’t it possible to simply do this

The following code

%1 = i32 1
%2 = i32 2
%3 = add i32 %1, %2

is more or less equivalent to

%1 = add i32 1, 2

Because, in most cases, Instructions don’t care if they’re given SSA Values or Constants as operands. The second version is more compact, more concise, and has less indirection than the first, so it’s presumably better in the vast majority of cases.

Is there something you’re trying to work around with the ‘Initializing registers’ bit of your initial example? ISTM that doing so would be entirely unnecessary if you took Jeremy’s approach (well, a simplified version of it; you’d only need to do most of step 2 if you’re guaranteed to only have a single basic block).
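To make the single-block version concrete, here is a minimal sketch in Python that emits IR text. The function name `translate_straight_line`, the `(dest, op, lhs, rhs)` tuple format and the `%reg_<name>_<n>` naming are all assumptions made up for this example. The only real idea is the dictionary that remembers the last value of each register, where a "value" may be either an SSA name or a plain constant.

```python
# Hypothetical sketch: single basic block, no phi nodes needed.
# Operands are register names ("a"), function arguments ("%var_c"), or ints.
def translate_straight_line(registers, program):
    last = {reg: "0" for reg in registers}     # every register starts as constant 0
    version = {reg: 0 for reg in registers}    # counter used to build fresh names
    lines = []

    def operand(x):
        if isinstance(x, int):
            return str(x)          # constants are used directly as operands
        if x.startswith("%"):
            return x               # function arguments are already SSA values
        return last[x]             # registers resolve to their latest value

    for dest, op, lhs, rhs in program:
        if op == "set":
            # REG dest = <operand>: no instruction at all, just update the map.
            last[dest] = operand(lhs)
        else:
            left, right = operand(lhs), operand(rhs)
            version[dest] += 1
            name = f"%reg_{dest}_{version[dest]}"
            lines.append(f"  {name} = {op} i32 {left}, {right}")
            last[dest] = name
    return lines, last


# REG A = VAR C + CONST 2 ; REG A = CONST 12
example = [("a", "add", "%var_c", 2), ("a", "set", 12, None)]
lines, last = translate_straight_line(["a", "b", "c"], example)
print(lines)   # ['  %reg_a_1 = add i32 %var_c, 2']
print(last)    # {'a': '12', 'b': '0', 'c': '0'}
```

With this, the 'Initializing registers' part disappears entirely, and the return value can be built with `i32 12` and `i32 0` as direct operands of the insertvalue instructions. The add is left unused here, which dead code elimination can clean up.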

Because %val doesn’t exist (really!).
In the actual IR, left hand sides do not exist.

The Value * (i32 1 in this case) is really linked directly into its users as a use.

So something like

define i32 @foo(i32 %a, i32 %b) {
  %c = add i32 %a, %b
  %d = add i32 %c, %a
  ret i32 %d
}

really looks like this:

ret <operand points to add i32 <operand points to add i32 <operand points to argument %a>, <operand points to argument %b>>, <operand points to argument %a>>

To give a statement ordering and basic blocks, they are intrusive-linked-list’d together into basic blocks. But the LHS still doesn’t exist, even in that form.

There are no “declaration” operations, so you can’t have an instruction that is just a value itself, because its only point would be to produce an LHS, and as we said, LHS’s don’t really exist.

Instead, because Value * supports both SSA values and constants, every user would just have a use pointing to an i32 1 directly.

So in this case, the fake LHS abstraction that LLVM assembly provides is just being exposed a bit.
In producing the IR, anything that is already a Value * (constants included) should just be used directly as an operand wherever it needs to go.
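If it helps, here is a toy model of that idea in Python. It is only an illustration and not how LLVM's classes are actually laid out; the class names and `print_block` are made up. Operands hold direct references to other values, and the `%N` names on the left-hand side are invented by the printer only when the block is written out as text.

```python
# Toy model (not LLVM's real data structures): values reference each other
# directly, and left-hand-side names exist only in the printed form.
class Value:
    pass

class Constant(Value):
    def __init__(self, number):
        self.number = number

class Argument(Value):
    def __init__(self, name):
        self.name = name

class Instruction(Value):
    def __init__(self, opcode, *operands):
        self.opcode = opcode
        self.operands = list(operands)   # direct references, no names involved

def print_block(instructions):
    names = {}                           # instruction -> name, built while printing

    def ref(value):
        if isinstance(value, Constant):
            return str(value.number)     # a constant prints as itself
        if isinstance(value, Argument):
            return f"%{value.name}"
        return names[value]              # an earlier instruction in the block

    out = []
    for inst in instructions:
        ops = ", ".join(ref(op) for op in inst.operands)
        if inst.opcode == "ret":
            out.append(f"ret i32 {ops}")
        else:
            names[inst] = f"%{len(names) + 1}"
            out.append(f"{names[inst]} = {inst.opcode} i32 {ops}")
    return out


a, b = Argument("a"), Argument("b")
c = Instruction("add", a, b)
d = Instruction("add", c, Constant(1))   # the constant is just another operand
r = Instruction("ret", d)
printed = print_block([c, d, r])
print("\n".join(printed))
```

Nothing in the model ever needs a "%val = i32 1" node: the constant is simply referenced from the instruction that uses it.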