Hello,
I am experimenting with increasing the vector register width in the x86 backend from 16 to 64 elements of 32 bits each.
For example, take a loop with 65 iterations.
If my IR contains one v64i32 operation plus one scalar operation, the backend still breaks the v64i32 into four v16i32 operations. I want it to keep v64i32. Likewise, a loop over 128 elements should be split into two v64i32 instructions.
To do this, I made the necessary changes in X86ISelLowering.cpp and rebuilt LLVM. When I then use -view-dag-combine2-dags, the graph shows the output I expect, but I get the following error on the console:
LLVM ERROR: Cannot select: t10: ch = store<ST256bitcast ([65 x i32]* @a to <64 x i32>*)(tbaa=<0x30c5438>)> t9, t7, t12, undef:i64
t7: v64i32 = add t6, t4
t6: v64i32,ch = load<LD256bitcast ([65 x i32]* @c to <64 x i32>*)(tbaa=<0x30c5438>)(dereferenceable)> t0, t14, undef:i64
t14: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x i32]* @c> 0
t13: i64 = TargetGlobalAddress<[65 x i32]* @c> 0
t3: i64 = undef
t4: v64i32,ch = load<LD256bitcast ([65 x i32]* @b to <64 x i32>*)(tbaa=<0x30c5438>)(dereferenceable)> t0, t16, undef:i64
t16: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x i32]* @b> 0
t15: i64 = TargetGlobalAddress<[65 x i32]* @b> 0
t3: i64 = undef
t12: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x i32]* @a> 0
t11: i64 = TargetGlobalAddress<[65 x i32]* @a> 0
t3: i64 = undef
In function: foo
The DAG after legalization is also attached here.
The source is a vector sum of 65 elements.
Kindly correct me.
65_dagcmbine2.pdf (16.2 KB)
Please correct me; I am stuck at this point.
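To make the expected splitting concrete, here is a minimal sketch of the arithmetic described above: a trip count is divided into full-width vector operations plus leftover scalar iterations. `ChunkPlan` and `planChunks` are hypothetical names for illustration only; they are not LLVM APIs and say nothing about how the backend actually performs the split.

```cpp
#include <cassert>

// Hypothetical helper types, not part of LLVM.
struct ChunkPlan {
    unsigned vectorOps;  // number of full vectors of `width` elements
    unsigned scalarOps;  // leftover iterations handled one element at a time
};

// Splits `tripCount` loop iterations into full vectors of `width`
// elements and a scalar remainder.
ChunkPlan planChunks(unsigned tripCount, unsigned width) {
    assert(width != 0 && "vector width must be non-zero");
    return ChunkPlan{tripCount / width, tripCount % width};
}
```

With a width of 16, 65 iterations give four vector operations plus one scalar, which is what the backend currently produces. With a width of 64, the same loop gives one vector operation plus one scalar, and 128 iterations give two vector operations, which is the behaviour I am trying to get.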
Thank you. I have looked at those links, but they do not cover the problem I described. I am working through this step by step.
So far I have not worked on the instruction selection phase or its files. Before that, I am trying to get legalization to accept vectors with more than 16 elements, i.e. 64 x i32. Up to now I have mainly worked with two files: registerinfo.td, where I define the register class used during legalization, and, most importantly, X86ISelLowering.cpp.
Is there any relation between these changes and instruction selection? Since instruction selection comes after combine and legalize, I have not worked on it yet.
Please correct me, I am stuck here.
Thank you again.
I also ran the following command:
llc -debug filer-knl_o3.ll
Its output is attached here. Looking at the output, can we say that legalization runs fine and that the error comes from instruction selection / pattern matching, which is not implemented yet?
So do I need to fix something at this stage, or should I move on to implementing instruction selection / pattern matching?
Please guide me.
Thank you.
debug-legalize.txt (18.6 KB)
Yes, that error is from instruction selection. I think your legalization changes worked fine.
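As a rough illustration of why a node can pass legalization and still hit "Cannot select", here is a toy model with two separate tables: one for types the legalizer leaves alone and one for (opcode, type) pairs the selector has a pattern for. Everything here (`ToyTarget`, `survivesLegalization`, `canSelect`, `makeThreadTarget`) is hypothetical and is not how LLVM stores this information. The table contents only assume the situation described in this thread, where v64i32 was made legal but no v64i32 patterns were written.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>

// Toy model only: none of these names are LLVM APIs.
using Node = std::pair<std::string, std::string>;  // (opcode, value type)

struct ToyTarget {
    std::set<std::string> legalTypes;  // types the legalizer leaves alone
    std::set<Node> patterns;           // nodes the selector can match
};

// Legalization stage: a node is kept as-is if its value type is legal.
bool survivesLegalization(const ToyTarget &t, const Node &n) {
    assert(!n.second.empty() && "node needs a value type");
    return t.legalTypes.count(n.second) != 0;
}

// Selection stage: a node is selectable only if a pattern exists for it.
bool canSelect(const ToyTarget &t, const Node &n) {
    return t.patterns.count(n) != 0;
}

// Assumed situation from the thread: v64i32 is legal, but only the
// v16i32 patterns exist.
ToyTarget makeThreadTarget() {
    ToyTarget t;
    t.legalTypes = {"v16i32", "v64i32"};
    t.patterns = {{"add", "v16i32"}, {"load", "v16i32"}, {"store", "v16i32"}};
    return t;
}
```

In this model a `store` of v64i32 survives legalization but is rejected by `canSelect`, which corresponds to the error above: the type was accepted, and the missing piece is a selection pattern for it.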
What is meant by folded instructions in LLVM?
How do they work?
The word "fold" is used all over LLVM. It generally refers to transformations which delete an instruction.
If you're asking about http://llvm.org/docs/CodeGenerator.html#instruction-folding , it just means an instruction which was produced by the "instruction folding" transform; there isn't anything special about the instruction itself.
-Eli
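A toy sketch of the idea, in the sense of "a transformation which deletes an instruction": a load that feeds an add is merged into the add as a memory operand, and the separate load disappears. The `Inst` struct and `foldLoads` function are made up for illustration and are not LLVM data structures or its actual folding code. The sketch also assumes the loaded register has no other users.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy instruction, not an LLVM data structure.
struct Inst {
    std::string op;    // "load" or "add"
    std::string dst;   // destination register
    std::string src0;  // first source (register, or address for "load")
    std::string src1;  // second source; may become a memory operand "[addr]"
};

// Folds each "load rX, addr" into an immediately following "add" whose
// second source is rX. Assumes rX has no other users.
std::vector<Inst> foldLoads(const std::vector<Inst> &in) {
    std::vector<Inst> out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        bool canFold = in[i].op == "load" && i + 1 < in.size() &&
                       in[i + 1].op == "add" && in[i + 1].src1 == in[i].dst;
        if (canFold) {
            Inst folded = in[i + 1];
            folded.src1 = "[" + in[i].src0 + "]";  // memory operand replaces the register
            out.push_back(folded);
            ++i;  // skip the add just rewritten; the load itself is dropped
        } else {
            out.push_back(in[i]);
        }
    }
    assert(out.size() <= in.size() && "folding never adds instructions");
    return out;
}
```

Given `load r1, b` followed by `add r2, r0, r1`, the result is the single instruction `add r2, r0, [b]`. The resulting add is an ordinary instruction; it is "folded" only in the sense that it was produced by this kind of transform.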