Question regarding getElementPtr/Addressing modes in backend

Hi,

I am writing a backend and having issues with properly lowering the result of getElementPtr ( specifically the add node that it generates).

If we take this IR:

%struct.rectangle = type { i24, i24 }

; Function Attrs: nounwind readonly
define i24 @area(%struct.rectangle* nocapture readonly %r) #0 {
entry:
%width = getelementptr inbounds %struct.rectangle* %r, i16 0, i32 0
%0 = load i24* %width, align 4, !tbaa !1
%height = getelementptr inbounds %struct.rectangle* %r, i16 0, i32 1
%1 = load i24* %height, align 4, !tbaa !6
%mul = mul nsw i24 %1, %0
ret i24 %mul
}

The DAG before isel would look like isel.png.

I would then pattern match the load nodes with:

[(set i24:$val, (load addr:$addr))]

Where addr is a complex pattern. This is fine for the 0th element in the struct, as there is no offset so the resultant assembly would be (which works fine):

r0 = *(i0)

The issue I have is with the second load. I need the result to be:

m0 = 4
r1 = *(i0 += m0) (offset is stored in a register(m0), and modifies the original pointer)

rather than

r1 = *(i0 + 4) (immediate)

(The specific registers numbers don’t matter).

I’m not sure how to achieve that. Currently addr gets matched in SelectAddr in XYZISelDAGToDAG.cpp. It will match the add node and “combine” (I’m not sure this is the correct terminology) it into my LDR node. The result of this can be seen sched.png

So instead of that TargetConst<4> I need a something that copies 4 into an M register, and then LDR uses that.

I’m not sure if I should put logic in SelectAddr to create a virtual register and create a series of copyfromreg/copytoreg to copy the constant into an M register and then pass that to the node? I feel like that is fairly ugly (as no other back-end does this), but I’m not sure what a better way would be.

I have tried looking through other backends for an example, but couldn’t find something similar. That could be due to me not knowing where to look.

Thanks in advance for any help.

Johnny

You might need to modify selectAddr so it doesn't fold the add node but instead returns two operands: one for the pointer and one for the offset. You can control the register classes for the two components of the address using MIOperandInfo so your base pointer can be IClass and the offset MClass. See SparcInstrInfo.td (and probably others) for examples.

I'm assuming you meant r1 = *(i0 + m0) rather than r1 = *(i0 += m0)? I don't know how you'd achieve that latter.

Steve Montgomery

Hi Steve,

Thanks for the tip regarding MIOperandInfo, I didn’t think of that part of the tablegen description.

Sadly, I did actually mean: r1 = *(i0 += m0).

So increment i0 by m0. Read memory the memory location “pointed” to by i0. Store in r1. Sadly I am not too familiar with compiler terminology, so I don’t know if there is a proper term for such a load.

Many CPU instruction sets have “autoincrement” / “postincrement” (pop) / “predecrement” (push) instructions that dereference a pointer register and also alter it by the size of the load or store.

The PowerPC has “load byte/short/word with update” instructions that add an arbitrary positive or negative literal offset (unrelated to the load/store size) to the pointer register BEFORE the load or store, and update the register with the final address.

So maybe the PowerPC back end is worth looking at?

From: "Bruce Hoult" <bruce@hoult.org>
To: "Johnny Val" <johnnydval@gmail.com>
Cc: "<llvmdev@cs.uiuc.edu>" <llvmdev@cs.uiuc.edu>
Sent: Thursday, October 23, 2014 8:31:35 AM
Subject: Re: [LLVMdev] Question regarding getElementPtr/Addressing modes in backend

Many CPU instruction sets have "autoincrement" / "postincrement"
(pop) / "predecrement" (push) instructions that dereference a
pointer register and also alter it by the size of the load or store.

The PowerPC has "load byte/short/word with update" instructions that
add an arbitrary positive or negative literal offset (unrelated to
the load/store size) to the pointer register BEFORE the load or
store, and update the register with the final address.

So maybe the PowerPC back end is worth looking at?

Yes, and so it ARM (which has both pre- and post-increment loads/stores).

For PowerPC, for example, this works as follows:

In PPCISelLowering.cpp, we have this:

  // PowerPC has pre-inc load and store's.
  setIndexedLoadAction(ISD::PRE_INC, MVT::i1, Legal);
  setIndexedLoadAction(ISD::PRE_INC, MVT::i8, Legal);
  setIndexedLoadAction(ISD::PRE_INC, MVT::i16, Legal);
  setIndexedLoadAction(ISD::PRE_INC, MVT::i32, Legal);
  setIndexedLoadAction(ISD::PRE_INC, MVT::i64, Legal);
  setIndexedStoreAction(ISD::PRE_INC, MVT::i1, Legal);
  setIndexedStoreAction(ISD::PRE_INC, MVT::i8, Legal);
  setIndexedStoreAction(ISD::PRE_INC, MVT::i16, Legal);
  setIndexedStoreAction(ISD::PRE_INC, MVT::i32, Legal);
  setIndexedStoreAction(ISD::PRE_INC, MVT::i64, Legal);

Then the DAGCombiner calls into the backend using the function PPCTargetLowering::getPreIndexedAddressParts, which tells DAGCombine when it can combine the the load and the increment into a singe pre/post-increment load or store.

The pre-increment stores are instruction-selected in PPCInstrInfo.td (look for the pre_store patterns). The pre-increment loads are selected in PPCISelDAGToDAG.cpp (TableGen can't yet handle instructions returning multiple outputs), look for the code just after ISD::PRE_INC around line 1060.

From your formula below, it looks like you have a pre-increment load.

-Hal

Thank you for all the help.

I feel like I am pretty close to getting this working for my architecture. The issue I am having right now is that I cannot get the DAGCombiner to lower/combine a load/store into a pre-indexed load/store.

I believe the “issue” is with this code(DAGCombiner.cpp:7841):

// If the pointer is not an add/sub, or if it doesn’t have multiple uses, bail
// out. There is no reason to make this a preinc/predec.
if ((Ptr.getOpcode() != ISD::ADD && Ptr.getOpcode() != ISD::SUB) ||
Ptr.getNode()->hasOneUse())
return false;

Specifically the “Ptr.GetNode()->hasOneUse()” statement. It seems like a fair assumption to make as most architectures probably have the option for “normal” loads as well as preinc/predec loads. Sadly my architecture only supports preinc addressing when accessing memory.

What would you suggest to do in this case? Create custom lowering for ISD::STORE/LOAD in my backend? Or do you think it’s better to patch DAGCombiner.cpp to allow targets to disable that check?

Cheers,

Johnny

From: "Johnny Val" <johnnydval@gmail.com>
To: "Hal Finkel" <hfinkel@anl.gov>
Cc: "Bruce Hoult" <bruce@hoult.org>, "llvmdev" <llvmdev@cs.uiuc.edu>
Sent: Friday, October 24, 2014 7:22:27 AM
Subject: Re: [LLVMdev] Question regarding getElementPtr/Addressing modes in backend

Thank you for all the help.

I feel like I am pretty close to getting this working for my
architecture. The issue I am having right now is that I cannot get
the DAGCombiner to lower/combine a load/store into a pre-indexed
load/store.

I believe the "issue" is with this code(DAGCombiner.cpp:7841):

// If the pointer is not an add/sub, or if it doesn't have multiple
uses, bail
// out. There is no reason to make this a preinc/predec.
if ((Ptr.getOpcode() != ISD::ADD && Ptr.getOpcode() != ISD::SUB) ||
Ptr.getNode()->hasOneUse())
return false;

Specifically the "Ptr.GetNode()->hasOneUse()" statement. It seems
like a fair assumption to make as most architectures probably have
the option for "normal" loads as well as preinc/predec loads. Sadly
my architecture only supports preinc addressing when accessing
memory.

What would you suggest to do in this case? Create custom lowering for
ISD::STORE/LOAD in my backend? Or do you think it's better to patch
DAGCombiner.cpp to allow targets to disable that check?

Yes, you can custom lower the loads and stores and do the rewriting in the backend. You can also just write a separate load/store pattern that feed in a '0' to the base that gets updated. Regardless of how you do it, you should handle this in the backend.

-Hal