Load Instruction that changes value of two registers

Hello,
I’m writing a backend for an architecture that only has LOAD Instructions that first copy the old value of the target register in another register and after that load the provided value into the register.

Example of an addition:
load a, reg1; // → copies old value of reg1 in reg2 and loads value from a into reg1
load b, reg1; // → copies old value of reg1 in reg2 and loads value from b into reg1
add reg1, reg2; // adds values from a and b and saves the result into reg1
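
To make the semantics of this example concrete, here is a minimal C++ sketch of the machine as described. The names Machine, load, add and addViaLoads are made up for illustration, and the sketch assumes that add leaves reg2 unchanged, which the description above does not state.

```cpp
#include <cassert>
#include <cstdint>

// Toy model of the two-register machine; not part of any real API.
struct Machine {
    std::int32_t reg1 = 0;
    std::int32_t reg2 = 0;

    // load X, reg1: the old value of reg1 is first copied into reg2,
    // then X is loaded into reg1.
    void load(std::int32_t value) {
        reg2 = reg1;
        reg1 = value;
    }

    // add reg1, reg2: reg1 = reg1 + reg2 (reg2 assumed to stay unchanged).
    void add() { reg1 += reg2; }
};

// Runs the addition example: load a; load b; add.
inline std::int32_t addViaLoads(std::int32_t a, std::int32_t b) {
    Machine m;
    m.load(a);            // reg1 = a, reg2 = old reg1
    m.load(b);            // reg1 = b, reg2 = a
    assert(m.reg2 == a);  // the second load preserved a in reg2
    m.add();              // reg1 = a + b
    return m.reg1;
}
```

The point of the sketch is that the second load is what makes a available in reg2, so the side effect of the load is useful here rather than just a clobber.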

So I need to describe the “load X, reg1” Instruction so that LLVM understands it correctly.
How can I do that in LLVM? Where is the best place to do that (TableGen, Instruction Selection, Instruction Lowering)?

It would be fine if I could tell LLVM that reg2 is invalid after a load operation, but I don’t know how to do that…
I tried the following in TableGen to let LLVM know that Regs2 isn’t valid anymore after a load, but it didn’t produce the desired result:

let Defs = [Regs2] in
{
  def LD: Inst<(outs Regs1:$dst), (ins MEM:$addr),
               "load $addr, $dst;",
               [(set Regs1:$dst, (load addr:$addr))]>;
}

Thanks in advance,
Markus

Hi Markus,

Hello,
I’m writing a backend for an architecture that only has LOAD Instructions that first copy the old value of the target register in another register and after that load the provided value into the register.

If I understand correctly, your load performs in parallel a copy and a load:
loadedVal<dstReg> load addr || <someReg> copy <dstReg>

If you forget about the copy part, you can simply model your load like this:
<dstReg>, <someReg> load addr

<someReg> will be an implicit definition and you’re good to go.
Obviously, this is not optimal.

Note that the original code (dst = load addr) does not define <someReg>, so I guess you have some rule to assign it (like <someReg> = <dstReg> + 1).
Therefore you may have to create a specific register class for that:
<BigDstReg> load addr
<dstReg> = BigDstReg.subIdx
And have a pattern using an EXTRACT_SUBREG (see ARM NEON).

Now, if you want to remember that <someReg> is not some trash value but contains the value of <dstReg> before this instruction, this is a different story.

The tricky part here is how to tell the compiler where dstReg comes from. You will only know that once you have chosen it.
You could make this choice a priori, but this is not optimal either.
Anyhow, I do not think there is a straight answer for your case.

Note: I was assuming that reg2 depends on the choice of reg1, if it is not the case, then, this is slightly a different story.

-Quentin

Hello Quentin,
thanks for the answer. Sadly I didn't completely get it.

Hi Markus,

Hello,
I'm writing a backend for an architecture that only has LOAD Instructions that first copy the old value of the target register in another register and after that load the provided value into the register.

If I understand correctly, your load performs in parallel a copy and a load:
loadedVal<dstReg> load addr || <someReg> copy <dstReg>

Yes, it does exactly that. It first copies the old register value and then performs a "normal" load.

Let me give some more details about the target platform:
It only has 2 usable registers (AKKU1 and AKKU2); it also has a status register, but that doesn't matter in this case.
It's only possible to directly load values into the AKKU1 register, and each load into AKKU1 first copies the old AKKU1 value into AKKU2.
The only way to get a value into AKKU2 is to first load it into AKKU1 and then copy it (this can be done via another AKKU1 load or a copy instruction).
There are ways to get a value into AKKU1/AKKU2 without changing the value of AKKU2/AKKU1, but they involve a couple of simple instructions, and that's why I don't want to do it that way (the runtime of the code would be quite bad).

Hello Markus,

Is the copy AKKU1 to AKKU2 mandatory?

If it is not, then the problem is much simpler because your load is a regular load with a special register class for the result of the load (i.e., a register class that only contains AKKU1).
The register allocator will split/spill the value accordingly.
You can then add a pass after register allocation that merges an adjacent copy and load.
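
As a rough illustration of such a merging pass, here is a self-contained C++ sketch over a toy instruction list. It does not use the LLVM API; the Op and Instr types, the mergeCopyLoad function, and the existence of a plain load that leaves AKKU2 alone are all assumptions made for the example.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy opcodes (hypothetical):
//   Copy1To2      copies AKKU1 into AKKU2
//   Load          loads into AKKU1 and leaves AKKU2 alone (assumed to exist)
//   LoadWithCopy  the combined "copy AKKU1 to AKKU2, then load" instruction
//   Add           adds AKKU2 to AKKU1
enum class Op { Copy1To2, Load, LoadWithCopy, Add };

struct Instr {
    Op op;
    std::string addr;  // only meaningful for Load and LoadWithCopy
};

// Peephole: replace each "Copy1To2; Load addr" pair with "LoadWithCopy addr".
// Everything else is copied through unchanged.
std::vector<Instr> mergeCopyLoad(const std::vector<Instr>& in) {
    std::vector<Instr> out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        const bool pair = in[i].op == Op::Copy1To2 &&
                          i + 1 < in.size() &&
                          in[i + 1].op == Op::Load;
        if (pair) {
            out.push_back({Op::LoadWithCopy, in[i + 1].addr});
            ++i;  // skip the load that was just merged
        } else {
            out.push_back(in[i]);
        }
    }
    return out;
}
```

In a real backend the same idea would be written as a MachineFunction pass that runs after register allocation; the sketch only shows the pattern being matched.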

If it is mandatory then again the hard part is taking advantage of this move.
You can add a pass to rewrite the MachineInstr IR after selection DAG and explicitly add the move. However, you will have to make a choice a priori on which value you will copy.
E.g.,
Let's say you have three live variables a, b, and c.
When you see this pattern, you will have to choose which one you are going to “preserve”.
d = load addr
=>
e<RC:AKKU2Only> = copy a
d<RC:AKKU1Only> = load addr

Note: you will have to rewrite the uses of the copied variables and may have to insert phis.

The only way to get a value into AKKU2 is to first load it into AKKU1 and then copy it (this can be done via another AKKU1 load or a copy instruction).
There are ways to get a value into AKKU1/AKKU2 without changing the value of AKKU2/AKKU1, but they involve a couple of simple instructions, and that’s why I don’t want to do it that way (the runtime of the code would be quite bad).

If you forget about the copy part, you can simply model your load like this:
<dstReg>, <someReg> load addr

<someReg> will be an implicit definition and you’re good to go.
Obviously, this is not optimal.

Do you mean a TableGen match pattern like this (AKKU1 and AKKU2 are register classes):
[(set (AKKU1:$dst, AKKU2:$dst2), (load addr:$addr))]

That obviously isn’t a valid pattern… So I don’t understand how to put that into a valid pattern…

Yes, this is not a valid pattern, it is something you would have to custom lower if you want to go in that direction.
That said, since you just have two registers, you may not want to do that. Indeed, you do not take advantage of the move here. As a consequence, you may not be able to allocate some code, e.g.,
a, garbage1 = load addr1
b, garbage2 = load addr2 <= this load defines two registers, thus a value has to be spilled. However, when reloading a, you will kill two registers!
add a, b
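
The difference between the two ways of modeling the load can be sketched in a few lines of C++ (C++17, for std::optional). This is only a toy model of what the compiler believes each register holds, with hypothetical names (RegState, loadClobbering, loadWithCopy, bothAvailable); it is not LLVM code.

```cpp
#include <optional>
#include <string>

// Which named value each register is known to hold.
// std::nullopt means "unknown / garbage" from the compiler's point of view.
struct RegState {
    std::optional<std::string> akku1;
    std::optional<std::string> akku2;
};

// Conservative model: the load defines AKKU1 and AKKU2 becomes garbage.
RegState loadClobbering(const RegState&, const std::string& name) {
    return {name, std::nullopt};
}

// Precise model: the load defines AKKU1 and the old AKKU1 moves to AKKU2.
RegState loadWithCopy(const RegState& s, const std::string& name) {
    return {name, s.akku1};
}

// True if a and b are both known to sit in the two registers (either order).
bool bothAvailable(const RegState& s, const std::string& a,
                   const std::string& b) {
    return (s.akku1 == a && s.akku2 == b) || (s.akku1 == b && s.akku2 == a);
}
```

With loadClobbering, loading a and then b never leaves both values available for the add, which is the allocation problem described above. With loadWithCopy the same two loads do, but only because the compiler is told where the old AKKU1 value went.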

Also wouldn’t LLVM think that AKKU2 has the same value as AKKU1 after the operation and not a trash value?

No, it wouldn’t.

Note that the original code (dst = load addr) does not define <someReg>, so I guess you have some rule to assign it (like <someReg> = <dstReg> + 1).
Therefore you may have to create a specific register class for that:
<BigDstReg> load addr
<dstReg> = BigDstReg.subIdx
And have a pattern using an EXTRACT_SUBREG (see ARM NEON).

So <BigDstReg> will be AKKU1 and AKKU2, and I would use something like this as the match pattern:
[(set (EXTRACT_SUBREG BigDestReg:$BigDestReg, 1), (load addr:$addr))]

Is that correct? LLVM will assume that the other part of the register is trash after the instruction?

Exactly. More precisely, the high part of this big register will never be used, so LLVM has no way to know what is inside.
Again, now that you have told us that you only have two registers, you can no longer ignore the move.

Now, if you want to remember that <someReg> is not some trash value but contains the value of <dstReg> before this instruction, this is a different story.

The tricky part here is how to tell the compiler where dstReg comes from. You will only know that once you have chosen it.
You could make this choice a priori, but this is not optimal either.
Anyhow, I do not think there is a straight answer for your case.

Note: I was assuming that reg2 depends on the choice of reg1, if it is not the case, then, this is slightly a different story.

-Quentin

Maybe it would be easier to take control of the DAG to MachineInstr step and arrange the instruction in a valid way?

I believe it would be simpler to take control after the MachineInstrs are generated.

Do you think that is a good idea? The register allocation should be pretty easy with only two registers…

Yes and no, if reloading a value kills two registers (i.e., all your allocatable space), you are in trouble.

Hope that helps!

-Quentin