Possible error in LegalizeDAG

I'm still trying to track down some alignment issues with loads (i.e.
8/16-bit loads being turned into 32-bit sign-extending loads), and I
cannot for the life of me figure out how to enter this section
of code:

    // If this is an unaligned load and the target doesn't support it,
    // expand it.
    if (!TLI.allowsUnalignedMemoryAccesses()) {
      unsigned ABIAlignment = TLI.getTargetData()->
        getABITypeAlignment(LD->getMemoryVT().getTypeForMVT());
      if (LD->getAlignment() < ABIAlignment) {
        Result = ExpandUnalignedLoad(cast<LoadSDNode>(Result.getNode()), DAG,
                                     TLI);
        Tmp1 = Result.getOperand(0);
        Tmp2 = Result.getOperand(1);
        Tmp1 = LegalizeOp(Tmp1);
        Tmp2 = LegalizeOp(Tmp2);
      }
    }

This is from LegalizeDAG.cpp:2146

The problem that I see is that LD->getAlignment() is set via the call
getMVTAlignment(VT) in SelectionDAG.cpp:3385, which in turn calls
TLI.getTargetData()->getABITypeAlignment(Ty).

So the check if (LD->getAlignment() < ABIAlignment) always fails, from
what I can see. Even if I specify in my DataLayout that i8 should have
a 32-bit ABI alignment, it does not help, because the load alignment is
set to the ABI alignment rather than being derived from the actual bit
size.
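
To make the reasoning concrete, here is a toy model in plain C++ of the comparison described above. It is only a sketch and does not use any LLVM API: ToyLoad, abiAlignmentBytes, makeLoad, and wouldExpandUnaligned are hypothetical names, and the 4-byte ABI alignment for small types is an assumed DataLayout setting. The explicit-alignment case is an assumption added for contrast.

```cpp
#include <cassert>

// Toy stand-in for a load node; NOT an LLVM type.
struct ToyLoad {
    unsigned memBits;    // width of the value in memory, in bits
    unsigned alignment;  // alignment recorded on the load, in bytes
};

// Assumed DataLayout: every type up to 32 bits gets a 4-byte ABI alignment.
inline unsigned abiAlignmentBytes(unsigned memBits) {
    return memBits <= 32 ? 4 : memBits / 8;
}

// Models the behavior described in the post: when no alignment is given
// (0 here), the load records the ABI alignment of its memory type.
inline ToyLoad makeLoad(unsigned memBits, unsigned explicitAlign = 0) {
    assert(memBits % 8 == 0 && "toy model handles whole bytes only");
    return ToyLoad{memBits,
                   explicitAlign ? explicitAlign : abiAlignmentBytes(memBits)};
}

// Models the guard in front of the unaligned-load expansion.
inline bool wouldExpandUnaligned(const ToyLoad &ld) {
    return ld.alignment < abiAlignmentBytes(ld.memBits);
}
```

In this model, makeLoad(8) records an alignment of 4 and is compared against an ABI alignment of 4, so the guard is false however the ABI alignment is configured. Only a load that carries a smaller explicit alignment, such as makeLoad(8, 1), passes the guard.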

Any hints would be greatly appreciated, this is a blocking issue that I
just cannot seem to resolve without modifying the LLVM codebase to
remove the extend + load -> extload combining step.

Micah Villmow

Systems Engineer

Advanced Technology & Performance

Advanced Micro Devices Inc.

S1-609 One AMD Place

Sunnyvale, CA. 94085

P: 408-749-3966

> I'm still trying to track down some alignment issues with loads(i.e. 8/16
> bit loads being turned into 32bit sign extending loads) and I cannot for the
> life of me seem to figure out how to enter this section of code:
>
> // If this is an unaligned load and the target doesn't support it,
> // expand it.

Why do you expect to enter this section of code? It's impossible for
an i8 load to be unaligned.

> Any hints would be greatly appreciated, this is a blocking issue that I just
> cannot seem to resolve without modifying the LLVM codebase to remove the
> extend + load -> extload combining step.

LLVM will "uncombine" it for you if you use setLoadExtAction with the
appropriate arguments.

-Eli

>> I'm still trying to track down some alignment issues with loads(i.e. 8/16
>> bit loads being turned into 32bit sign extending loads) and I cannot for the
>> life of me seem to figure out how to enter this section of code:
>>
>> // If this is an unaligned load and the target doesn't support it,
>> // expand it.
>
> Why do you expect to enter this section of code? It's impossible for
> an i8 load to be unaligned.

The hardware I am targeting is not a CPU. I must support i8 loads, but
the hardware natively supports only 32-bit aligned loads, so I have to
read in four i8s at once, then shift and unpack them based on the read
address. Any i8 load therefore has a 75% chance of being unaligned on my
hardware, so I need a way to tell LLVM either not to generate sext_loads
or, if it does, to expand them. Nothing suggested so far has worked.
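
The read-and-unpack step described above can be sketched in plain C++. This is only an illustration under stated assumptions, not the target's actual lowering: loadI8SignExtended is a hypothetical helper, memory is modeled as an array of aligned 32-bit words, and byte 0 of each word is assumed to sit in bits 7..0 (little-endian lane order).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Emulates a sign-extending i8 load on a memory that can only be read as
// aligned 32-bit words. `words` models that memory; `byteAddr` is a byte
// address into it.
inline std::int32_t loadI8SignExtended(const std::uint32_t *words,
                                       std::size_t byteAddr) {
    assert(words != nullptr);
    // Read the aligned word that contains the requested byte.
    std::uint32_t word = words[byteAddr / 4];
    // Assumption: byte lane 0 lives in bits 7..0 of the word.
    unsigned shift = static_cast<unsigned>(byteAddr % 4) * 8;
    std::uint32_t byte = (word >> shift) & 0xFFu;
    // Sign-extend the 8-bit value to 32 bits without relying on
    // implementation-defined narrowing conversions.
    return static_cast<std::int32_t>(byte ^ 0x80u) - 0x80;
}
```

Only addresses with byteAddr % 4 == 0 need no shift, which is where the 75% figure comes from: three of the four byte lanes in a word need the shift-and-mask step.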

>> Any hints would be greatly appreciated, this is a blocking issue that I just
>> cannot seem to resolve without modifying the LLVM codebase to remove the
>> extend + load -> extload combining step.
>
> LLVM will "uncombine" it for you if you use setLoadExtAction with the
> appropriate arguments.
>
> -Eli

I've tried setting setLoadXAction to Custom, Legal, Expand, and Promote.
With Custom, an assert fires somewhere, because the code expects the
operation to have a certain form and my custom load instruction has a
different one. With Legal, the sext_load is generated in the first
DAG combine pass, because the combiner never checks whether it should make
this combination; and since the section of code I mentioned earlier is
never entered, nothing uncombines it. With Promote, it asserts with
"not yet implemented". With Expand, the result is not sign_extend plus
load but extload plus sign_extend, and I don't support extload either.

Please correct me if I am wrong, but I've been looking at this issue for
a while now and I cannot see where the sextload gets uncombined into a
load and a sign_extension.

My current solution is to just comment out that combination so that it
never occurs.

Thanks,
Micah

> On the hardware that I am targeting, which is not a CPU, I must support
> i8 loads, however the hardware only supports natively 32bit aligned
> loads, therefore I have to read in 4 i8's and unpack them and shift them
> based on the read address. So any i8 load has a 75% chance of being
> unaligned on my hardware,

Oh, okay, makes sense.

> I've tried setting setLoadXAction to Custom, Legal, Expand and Promote.
> Setting it to Expand does not expand it to
> sign_extend and load but to extload and sign_extend, but I don't
> support extload either.

I suppose you could consider that a bug. That said, why is this
difficult to implement? You can just treat an extload of an i8 as a
load of an i8 and get correct code, no?
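
To illustrate the point: an extload leaves the upper bits unspecified, so the sign extension that follows depends only on the low 8 bits, and any i8 load that puts the byte in the low bits will do. A minimal sketch in plain C++ follows. It is a toy model, not LLVM code, and anyExtLoadI8 and signExtendInRegI8 are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Toy any-extending i8 load: only the low 8 bits of the result are
// meaningful. `junk` stands in for whatever the upper 24 bits contain.
inline std::uint32_t anyExtLoadI8(const std::uint8_t *p, std::uint32_t junk) {
    assert(p != nullptr);
    return (junk & 0xFFFFFF00u) | *p;
}

// Sign-extends the low 8 bits of a 32-bit value, ignoring the upper bits.
inline std::int32_t signExtendInRegI8(std::uint32_t v) {
    return static_cast<std::int32_t>((v & 0xFFu) ^ 0x80u) - 0x80;
}
```

Whatever value is passed as junk, signExtendInRegI8(anyExtLoadI8(p, junk)) gives the same result, which is why the pair is equivalent to a sign-extending load in this model.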

-Eli

From: llvmdev-bounces@cs.uiuc.edu [mailto:llvmdev-bounces@cs.uiuc.edu] On Behalf Of Eli Friedman
Sent: Thursday, February 19, 2009 11:18 AM
To: LLVM Developers Mailing List
Subject: Re: [LLVMdev] Possible error in LegalizeDAG

> On the hardware that I am targeting, which is not a CPU, I must support
> i8 loads, however the hardware only supports natively 32bit aligned
> loads, therefore I have to read in 4 i8's and unpack them and shift them
> based on the read address. So any i8 load has a 75% chance of being
> unaligned on my hardware,

Oh, okay, makes sense.

> I've tried setting setLoadXAction to Custom, Legal, Expand and Promote.
> Setting it to Expand does not expand it to
> sign_extend and load but to extload and sign_extend, but I don't
> support extload either.

I suppose you could consider that a bug. That said, why is this
difficult to implement? You can just treat an extload of an i8 as a
load of an i8 and get correct code, no?

[Micah Villmow] The problem with the extload is that it still
generates a 32-bit extload instead of an 8-bit extload.