Instcombine and bitcast of vector. Wrong CHECKs in cast.ll, miscompile in instcombine?

mikaelholmen · November 28, 2019, 2:12pm

Hi,

In
llvm/test/Transforms/InstCombine/cast.ll
there is a test like this:

target datalayout = "E-p:64:64:64-p1:32:32:32-p2:64:64:64-p3:64:64:64-a0:0:8-f32:32:32-f64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-v64:64:64-v128:128:128-n8:16:32:64"

[...]

define <3 x i32> @test60(<4 x i32> %call4) {
; CHECK-LABEL: @test60(
; CHECK-NEXT:    [[P10:%.*]] = shufflevector <4 x i32> [[CALL4:%.*]], <4 x i32> undef, <3 x i32> <i32 0, i32 1, i32 2>
; CHECK-NEXT:    ret <3 x i32> [[P10]]
;
  %p11 = bitcast <4 x i32> %call4 to i128
  %p9 = trunc i128 %p11 to i96
  %p10 = bitcast i96 %p9 to <3 x i32>
  ret <3 x i32> %p10
}

If we assume the input vector is e.g. <1, 2, 3, 4> then I assume %p11
would be the value 1234 (writing one hex digit per 32-bit element, most
significant first), %p9 would be 234 and %p10 would then be the vector
<2, 3, 4>.

Am I right, or am I missing something here? Note that the datalayout
says we're using big endian.

But the CHECK-NEXT checks that the result is made up of the elements at
index 0, 1 and 2 from the input vector, which would be <1, 2, 3>.

So, broken testcase or am I missing something?

Thanks,
Mikael

rotateright · November 28, 2019, 3:55pm

Looks broken to me - we need to consider big/little-endian datalayout when bitcasting to/from vectors.
We should have some documentation for this in the LangRef, but I don’t see anything currently.

The transform in question was added here:
https://reviews.llvm.org/rL103354

You can find other vector bitcast transforms that (hopefully correctly…) account for the datalayout difference for vector elements.

Example:

https://github.com/llvm/llvm-project/blob/master/llvm/test/Transforms/InstCombine/bitcast-bigendian.ll#L10
https://github.com/llvm/llvm-project/blob/master/llvm/test/Transforms/InstCombine/bitcast.ll#L290

So we need something like this:
https://github.com/llvm/llvm-project/blob/master/llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp#L487
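
To make "consider the datalayout" concrete, here is a rough sketch of which source elements a bitcast/trunc/bitcast chain keeps. It is not the InstCombine code; trunc_shuffle_mask and bitcast_trunc_bitcast are hypothetical helpers, and the reference model assumes a bitcast behaves like storing the value to memory and loading it back with the other type.

```python
def bitcast_trunc_bitcast(vec, dst_elts, elem_bytes, byteorder):
    """Reference model: store the vector, reload as one integer, trunc, split."""
    # Element 0 sits at the lowest address for either byte order.
    mem = b"".join(e.to_bytes(elem_bytes, byteorder) for e in vec)
    wide = int.from_bytes(mem, byteorder)
    narrow = wide & ((1 << (8 * elem_bytes * dst_elts)) - 1)  # trunc keeps low bits
    out = narrow.to_bytes(elem_bytes * dst_elts, byteorder)
    return [int.from_bytes(out[i:i + elem_bytes], byteorder)
            for i in range(0, len(out), elem_bytes)]


def trunc_shuffle_mask(src_elts, dst_elts, big_endian):
    """Indices of the source elements that survive the truncation."""
    # Little-endian keeps the leading elements, big-endian the trailing ones.
    first = src_elts - dst_elts if big_endian else 0
    return list(range(first, first + dst_elts))


vec = [10, 20, 30, 40]
for byteorder in ("little", "big"):
    mask = trunc_shuffle_mask(len(vec), 3, byteorder == "big")
    shuffled = [vec[i] for i in mask]
    # The shuffle must agree with the store/load reference model.
    assert shuffled == bitcast_trunc_bitcast(vec, 3, 4, byteorder)
    print(byteorder, mask, shuffled)
```

In this sketch the big-endian mask for two source elements and one kept element is [1], which is consistent with the `extractelement ..., i64 1` that test2 in the linked bitcast-bigendian.ll expects.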

bjope · November 28, 2019, 4:20pm

Thanks Sanjay for confirming that this seems to be broken.

I’ll check with Mikael to make sure we write a new PR (probably have to wait until tomorrow).

PS. This kind of confirms that DAGTypeLegalizer::PromoteIntRes_BITCAST is doing the wrong thing for big-endian as well, see https://bugs.llvm.org/show_bug.cgi?id=44135 , so maybe we can set that PR to “confirmed” (and then move forward with making a proper patch based on the workaround presented in that PR).

/Björn

mikaelholmen · November 29, 2019, 6:25am

> Looks broken to me - we need to consider big/little-endian datalayout
> when bitcasting to/from vectors.

Great, that's what I thought.

I ran into this problem in a real miscompile for our out-of-tree
target, did a tentative fix to instcombine, but then noticed that
test60 in cast.ll failed and got a bit confused/worried.

> We should have some documentation for this in the LangRef, but I
> don't see anything currently.
>
> The transform in question was added here:
> https://reviews.llvm.org/rL103354 (rG02b0df5338e0)

Yes, from 2010. It also added to my confusion that the bug had lived
for so long when I saw the code was added in that commit, hence the
email.

> You can find other vector bitcast transforms that (hopefully
> correctly...) account for the datalayout difference for vector
> elements.
>
> Example:
>
> https://github.com/llvm/llvm-project/blob/main/llvm/test/Transforms/InstCombine/bitcast-bigendian.ll#L10

; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
; RUN: opt < %s -passes=instcombine -S | FileCheck %s

target datalayout = "E-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-f128:128:128-v128:128:128-n32:64"
target triple = "powerpc64-unknown-linux-gnu"

; These tests are extracted from bitcast.ll.
; Verify that they also work correctly on big-endian targets.

define float @test2(<2 x float> %A, <2 x i32> %B) {
; CHECK-LABEL: @test2(
; CHECK-NEXT:    [[TMP24:%.*]] = extractelement <2 x float> [[A:%.*]], i64 1
; CHECK-NEXT:    [[BC:%.*]] = bitcast <2 x i32> [[B:%.*]] to <2 x float>
; CHECK-NEXT:    [[TMP4:%.*]] = extractelement <2 x float> [[BC]], i64 1
; CHECK-NEXT:    [[ADD:%.*]] = fadd float [[TMP24]], [[TMP4]]
; CHECK-NEXT:    ret float [[ADD]]
;
  %tmp28 = bitcast <2 x float> %A to i64
  %tmp23 = trunc i64 %tmp28 to i32
  %tmp24 = bitcast i32 %tmp23 to float
[...]

> https://github.com/llvm/llvm-project/blob/main/llvm/test/Transforms/InstCombine/bitcast.ll#L290

> So we need something like this:
>
> https://github.com/llvm/llvm-project/blob/main/llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp#L487

Good, then I'll write a PR for the instcombine issue and broken test
and probably also put up a patch for review.

I think test61 in cast.ll (checking what happens when we
bitcast/zext/bitcast) is also broken: it seems to insert zeroes at the
wrong end for big-endian targets, so I'll include something about that
too.
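
The zext direction can be sketched the same way. The types below (<3 x i32> to i96, zext to i128, then <4 x i32>) are an assumption for illustration, since test61 itself is not quoted in this thread. bitcast_zext_bitcast is a hypothetical helper that models a bitcast as a store followed by a load of the other type.

```python
def bitcast_zext_bitcast(vec, dst_elts, elem_bytes, byteorder):
    """Model bitcast-to-integer, zext, bitcast-to-wider-vector via memory."""
    # Store the vector: element 0 goes to the lowest address.
    mem = b"".join(e.to_bytes(elem_bytes, byteorder) for e in vec)
    # Reload as one integer; zext leaves the numeric value unchanged.
    value = int.from_bytes(mem, byteorder)
    # Store the wider integer and reload it as dst_elts elements.
    out = value.to_bytes(elem_bytes * dst_elts, byteorder)
    return [int.from_bytes(out[i:i + elem_bytes], byteorder)
            for i in range(0, len(out), elem_bytes)]


print("little-endian:", bitcast_zext_bitcast([1, 2, 3], 4, 4, "little"))
print("big-endian:   ", bitcast_zext_bitcast([1, 2, 3], 4, 4, "big"))
```

With this model the zero element lands at index 3 on little-endian (<1, 2, 3, 0>) but at index 0 on big-endian (<0, 1, 2, 3>), so a fold that always appends the zero after the original elements would only be right for little-endian.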

As Björn mentioned we've seen a similar issue (PR44135) in the
DAGLegalizer so it seems vectors on big-endian machines aren't that
well tested, at least some code patterns aren't.

We triggered both these problems after enabling unrolling for our
target, so I suppose that resulted in some new code patterns. We'll see
if anything more comes out of that.

Thanks,
Mikael

mikaelholmen · November 29, 2019, 7:52am

I wrote PR44178 ([InstCombine] Miscompile of bitcast/zext/trunc/bitcast
on vectors for big-endian targets) about this.

Thanks,
Mikael

