InstCombine question on combineLoadToOperationType

Hello,

Context: We have a backend where v32i1 is a Legal type, but the storage for v32i1 is not 32-bits/uses a different instruction sequence.

We ran into an issue because combineLoadToOperationType changed v32i1 loads into i32 loads, so a sequence like:

define void @bits(<32 x i1>* %A, <32 x i1>* %B) {
  %a = load <32 x i1>, <32 x i1>* %A
  store <32 x i1> %a, <32 x i1>* %B
  ret void
}

is transformed to:

define void @bits(<32 x i1>* %A, <32 x i1>* %B) {
  %1 = bitcast <32 x i1>* %A to i32*
  %a1 = load i32, i32* %1, align 4
  %2 = bitcast <32 x i1>* %B to i32*
  store i32 %a1, i32* %2, align 4
  ret void
}

This looks to be intentional.

Is there a way to specify in the data-layout that v32i1 storage is not 32-bits?

Absent that, is there any other reliable way to retain the original vector loads/stores without just disabling this part of InstCombine?

Or is it the backend’s responsibility to try and work with this?

Thanks!

Pete

No, not at the moment. You could propose something, but you'd probably have a hard time convincing anyone it's necessary; nobody has cared about this for a very long time.

No, and you'll run into other problems (e.g. alias analysis) if the data layout lies about the size of a load or store.

Where are these loads coming from? x86 without AVX512 doesn't have any convenient way to generate code for a <32 x i1> store, but it doesn't matter because frontends don't generate <N x i1> loads and stores.

If you have a frontend which is generating loads and stores like this, you could probably change it to use some other sequence (like a platform-specific intrinsic, or some sequence involving sext/trunc).

-Eli

Hello,

Context: We have a backend where v32i1 is a Legal type, but the storage for v32i1 is not 32-bits/uses a different instruction sequence.
We ran into an issue because combineLoadToOperationType changed v32i1 loads into i32 loads, so a sequence like:
define void @bits(<32 x i1>* %A, <32 x i1>* %B) {
  %a = load <32 x i1>, <32 x i1>* %A
  store <32 x i1> %a, <32 x i1>* %B
  ret void
}

is transformed to:
define void @bits(<32 x i1>* %A, <32 x i1>* %B) {
  %1 = bitcast <32 x i1>* %A to i32*
  %a1 = load i32, i32* %1, align 4
  %2 = bitcast <32 x i1>* %B to i32*
  store i32 %a1, i32* %2, align 4
  ret void
}

This looks to be intentional.
Is there a way to specify in the data-layout that v32i1 storage is not 32-bits?

No, not at the moment. You could propose something, but you'd probably have a hard time convincing anyone it's necessary; nobody has cared about this for a very long time.

Absent that, is there any other reliable way to retain the original vector loads/stores without just disabling this part of InstCombine?

No, and you'll run into other problems (e.g. alias analysis) if the data layout lies about the size of a load or store.

Or is it the backend’s responsibility to try and work with this?

Where are these loads coming from? x86 without AVX512 doesn't have any convenient way to generate code for a <32 x i1> store, but it doesn't matter because frontends don't generate <N x i1> loads and stores.

If you have a frontend which is generating loads and stores like this, you could probably change it to use some other sequence (like a platform-specific intrinsic, or some sequence involving sext/trunc).
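
For illustration, the intrinsic flavor of this suggestion could look roughly like the sketch below. The @mytarget_* declarations are placeholders for whatever the target actually provides; they are not existing LLVM intrinsics.

; Placeholder declarations standing in for target-specific mask load/store
; operations; the real names and signatures would come from the backend.
declare <32 x i1> @mytarget_load_v32i1(<32 x i1>*)
declare void @mytarget_store_v32i1(<32 x i1>, <32 x i1>*)

define void @bits_intrin(<32 x i1>* %A, <32 x i1>* %B) {
  %a = call <32 x i1> @mytarget_load_v32i1(<32 x i1>* %A)
  call void @mytarget_store_v32i1(<32 x i1> %a, <32 x i1>* %B)
  ret void
}

Since InstCombine no longer sees a plain load/store of <32 x i1>, there is nothing for combineLoadToOperationType to rewrite; the trade-off, raised further down the thread, is that the rest of the optimizer can no longer treat these as ordinary memory accesses.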

We do have a frontend that can generate <32 x i1> loads/stores, though it is rare for these to be inst-combined into i32 loads/stores as above (those were only illustrative examples).
I'm trying to decide on the best way to remedy this, and this info and these suggestions help.
Thanks!

Pete

Why not just generate the code with the proper storage? If <32 x i1> values are used where the storage is <32 x i8> (for example), it seems like a bad idea to lie to the IR and hide it behind a platform-specific intrinsic, right? I fear this would cause other problems down the line in the optimizer.
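
For completeness, a minimal sketch of what "proper storage" might look like, assuming the in-memory form is chosen to be <32 x i8>; the widened type and the function name are only illustrative:

; Illustrative only: memory holds <32 x i8>; the <32 x i1> mask exists
; only in registers, recovered with trunc and widened again with sext.
define void @bits_widened(<32 x i8>* %A, <32 x i8>* %B) {
  %w = load <32 x i8>, <32 x i8>* %A
  %a = trunc <32 x i8> %w to <32 x i1>   ; recover the mask (low bit of each lane)
  %s = sext <32 x i1> %a to <32 x i8>    ; widen before storing (0 or -1 per lane)
  store <32 x i8> %s, <32 x i8>* %B
  ret void
}

Because the loaded and stored type matches what is actually in memory, the data layout, alias analysis, and InstCombine all see an accurate picture.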