define <3 x i32> @test60(<4 x i32> %call4) {
; CHECK-LABEL: @test60(
; CHECK-NEXT: [[P10:%.*]] = shufflevector <4 x i32> [[CALL4:%.*]], <4 x i32> undef, <3 x i32> <i32 0, i32 1, i32 2>
; CHECK-NEXT: ret <3 x i32> [[P10]]
;
%p11 = bitcast <4 x i32> %call4 to i128
%p9 = trunc i128 %p11 to i96
%p10 = bitcast i96 %p9 to <3 x i32>
ret <3 x i32> %p10
}
If we assume the input vector is e.g. <1, 2, 3, 4> then I assume %p11
would be the (hex) value 1234, %p9 would be the 234 and %p10 would then
be the vector <2, 3, 4>.
Am I right, or am I missing something here? Note that the datalayout
says we're using big endian.
But the CHECK-NEXT checks that the result is made up of the elements at
index 0, 1 and 2 from the input vector, which would be <1, 2, 3>.
Looks broken to me - we need to consider big/little-endian datalayout when bitcasting to/from vectors.
We should have some documentation for this in the LangRef, but I don’t see anything currently.
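As a rough sketch of what "considering the datalayout" could mean for the bitcast/trunc/bitcast pattern: the shuffle mask that replaces the sequence has to pick the elements that survive the truncation, and which ones survive depends on endianness. The function name `trunc_shuffle_mask` is hypothetical and this is not the InstCombine code. It only assumes that trunc keeps the low bits and that the low bits hold the last vector elements on big-endian targets and the first ones on little-endian targets.

```python
def trunc_shuffle_mask(src_elts, dst_elts, big_endian):
    """Indices of the source elements that remain after truncating the
    bitcast integer down to dst_elts elements' worth of bits."""
    if not 0 < dst_elts <= src_elts:
        raise ValueError("destination must have between 1 and src_elts elements")
    # The low bits hold the trailing elements on big endian,
    # and the leading elements on little endian.
    first = src_elts - dst_elts if big_endian else 0
    return list(range(first, first + dst_elts))


print(trunc_shuffle_mask(4, 3, big_endian=False))  # [0, 1, 2]
print(trunc_shuffle_mask(4, 3, big_endian=True))   # [1, 2, 3]
```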
Thanks Sanjay for confirming that this seems to be broken.
I’ll check with Mikael to make sure we write a new PR (probably have to wait until tomorrow).
PS. This kind of confirms that DAGTypeLegalizer::PromoteIntRes_BITCAST is doing the wrong thing for big-endian as well (see https://bugs.llvm.org/show_bug.cgi?id=44135), so maybe we can set that PR to “confirmed” and then move forward with making a proper patch based on the workaround presented in that PR.
> Looks broken to me - we need to consider big/little-endian datalayout
> when bitcasting to/from vectors.
Great, that's what I thought.
I ran into this problem in a real miscompile for our out-of-tree
target, did a tentative fix to instcombine, but then noticed that
test60 in cast.ll failed and got a bit confused/worried.
> We should have some documentation for this in the LangRef, but I
> don't see anything currently.
> The transform in question was added here: rG02b0df5338e0
Yes, from 2010. It also added to my confusion that the bug had lived
for so long when I saw the code was added in that commit, hence the
email.
> You can find other vector bitcast transforms that (hopefully
> correctly...) account for the datalayout difference for vector
> elements.
> Example:
> So we need something like this:
Good, then I'll write a PR for the instcombine issue and broken test
and probably also put up a patch for review.
I think test61 in cast.ll (checking what happens when we
bitcast/zext/bitcast) is also broken, I think it inserts zeroes at the
wrong end for big-endian targets so I'll include something about that
too.
As Björn mentioned we've seen a similar issue (PR44135) in the
DAGLegalizer so it seems vectors on big-endian machines aren't that
well tested, at least some code patterns aren't.
We triggered both these problems after enabling unrolling for our
target, so I suppose that resulted in some new code patterns. We'll see
if anything more comes out of that.
Thanks,
Mikael
> > Hi,
> >
> > In
> > llvm/test/Transforms/InstCombine/cast.ll
> > there is a test like this:
> >
> > target datalayout = "E-p:64:64:64-p1:32:32:32-p2:64:64:64-p3:64:64:64-a0:0:8-f32:32:32-f64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-v64:64:64-v128:128:128-n8:16:32:64"
> >
> > [...]
> >
> > define <3 x i32> @test60(<4 x i32> %call4) {
> > ; CHECK-LABEL: @test60(
> > ; CHECK-NEXT: [[P10:%.*]] = shufflevector <4 x i32> [[CALL4:%.*]], <4 x i32> undef, <3 x i32> <i32 0, i32 1, i32 2>
> > ; CHECK-NEXT: ret <3 x i32> [[P10]]
> > ;
> > %p11 = bitcast <4 x i32> %call4 to i128
> > %p9 = trunc i128 %p11 to i96
> > %p10 = bitcast i96 %p9 to <3 x i32>
> > ret <3 x i32> %p10
> >
> > }
> >
> > If we assume the input vector is e.g. <1, 2, 3, 4> then I assume %p11
> > would be the (hex) value 1234, %p9 would be the 234 and %p10 would then
> > be the vector <2, 3, 4>.
> >
> > Am I right, or am I missing something here? Note that the datalayout
> > says we're using big endian.
> >
> > But the CHECK-NEXT checks that the result is made up of the elements at
> > index 0, 1 and 2 from the input vector, which would be <1, 2, 3>.
> >
> > So, broken testcase or am I missing something?
> >
> > Thanks,
> > Mikael
> >