Hi Duncan,
1) What causes the Initial selection DAG code to choose an any_extend over a sign_extend (or zero_extend)?
Because it is more efficient: the backend gets more choice in how to do
the extension, and at the same time it tells the optimizers that the extra
bits contain rubbish, which gives them more freedom to reason.
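To make the distinction concrete, here is a minimal C sketch of what the three extension kinds promise about the upper bits when widening 8 bits to 32. This is only a model, not how the SelectionDAG implements anything: the function names are made up for illustration, and the `junk` parameter is a hypothetical stand-in for whatever the backend happens to leave in the upper bits of an any_extend.

```c
#include <stdint.h>

/* sign_extend: the upper 24 bits are copies of bit 7 of the input. */
uint32_t sign_extend_8(uint8_t v)
{
    return ((uint32_t)v ^ 0x80u) - 0x80u;
}

/* zero_extend: the upper 24 bits are all zero. */
uint32_t zero_extend_8(uint8_t v)
{
    return (uint32_t)v;
}

/* any_extend: only the low 8 bits are meaningful. The upper 24 bits
 * may be anything; "junk" (hypothetical) models that freedom. */
uint32_t any_extend_8(uint8_t v, uint32_t junk)
{
    return (junk & 0xFFFFFF00u) | (uint32_t)v;
}
```

Both sign_extend_8 and zero_extend_8 are valid ways to implement any_extend_8, which is why the backend is free to pick whichever is cheapest.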
Makes sense, though I was wondering why it would sign_extend an 8-bit or 16-bit value but any_extend a 32-bit value (these are all signed values).
I'm not sure what you mean by "these are all signed values" - in LLVM there
is no such thing as a "signed i16" or an "unsigned i16", there is only i16
with signed and unsigned operations. As to why you get a sign-extend in
some cases and any-extend in others... well, it depends on details of what
you are compiling, so without a testcase it is hard to say anything useful.
Yes, I could have worded that better. The particular test case I'm using is the following bit of C, which is compiled to LLVM using Clang without any optimisation:
int func(void)
{
    return 0;
}
I then compile the same thing with func returning char or short (which are all signed types, which I realise isn't reflected in the LLVM types).
Clang generates the following LLVM:
; ModuleID = 'test.c'
target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128"
target triple = "x86_64-unknown-linux-gnu"
define i32 @func() nounwind {
entry:
%retval = alloca i32 ; <i32*> [#uses=2]
store i32 0, i32* %retval
%0 = load i32* %retval ; <i32> [#uses=1]
ret i32 %0
}
When I use char or short it generates the same code, only using i8 or i16 respectively. There is one more crucial difference, which I only noticed after looking more closely: it adds signext to the function definition in the i8 and i16 cases, like so:
define signext i16 @func() nounwind {
That is where the sign_extend/any_extend difference comes from.
So it's Clang that makes the decision.
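For reference, the three variants compared above can be written side by side as below. The distinct function names are only there so they fit in one file (the original test used func each time). As described above, with this Clang and target the short and char versions are the ones whose definitions pick up signext, while the int version does not.

```c
/* Returns i32 in the generated LLVM: no signext on the definition. */
int func_int(void)
{
    return 0;
}

/* Returns i16: the definition observed above was "define signext i16". */
short func_short(void)
{
    return 0;
}

/* Returns i8: reported above to get signext as well. */
char func_char(void)
{
    return 0;
}
```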
Out of interest, the Mips backend does deal with any_extend; it just does so by using Pat to map them directly to unsigned loads rather than defining them as a Mips instruction, which is why I missed them when extending it for 64-bit.
Cheers,
Greg