Test failure of sparse_sign test in Apple Silicon

Lewuathe · March 22, 2024, 6:38am

As discussed in this issue, we have found that Apple Silicon platforms do not seem to support negative nan. This causes a test failure when we check the signedness of nan in the output while running the test with the CPU runner.

The function used in CRunnerUtils always prints nan on Apple Silicon.

extern "C" void printF32(float f) { fprintf(stdout, "%g", f); }

We have fixed the issue in the complex dialect tests by no longer checking the signedness of the nan value, which makes them platform agnostic.

We have found that the same situation occurs in a test of the sparse dialect.

    //
    // Verify the results.
    //
    // CHECK:      ---- Sparse Tensor ----
    // CHECK-NEXT: nse = 12
    // CHECK-NEXT: dim = ( 32 )
    // CHECK-NEXT: lvl = ( 32 )
    // CHECK-NEXT: pos[0] : ( 0, 12
    // CHECK-NEXT: crd[0] : ( 0, 3, 5, 11, 13, 17, 18, 20, 21, 28, 29, 31
    // CHECK-NEXT: values : ( -1, 1, -1, 1, 1, -1, nan, -nan, 1, -1, -0, 0
    // CHECK-NEXT: ----
    //

The test result on Apple Silicon is as follows. We do not get -nan.

---- Sparse Tensor ----
nse = 12
dim = ( 32 )
lvl = ( 32 )
pos[0] : ( 0, 12,  )
crd[0] : ( 0, 3, 5, 11, 13, 17, 18, 20, 21, 28, 29, 31,  )
values : ( -1, 1, -1, 1, 1, -1, nan, nan, 1, -1, -0, 0,  )
----

The test explicitly specifies the bit pattern for negative nan, which makes the test platform dependent.
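As a rough illustration of what "specifying the bit pattern" means, the sketch below builds a float from raw bits and checks its sign with the standard library. The helper names (floatFromBits, makeNaN) are hypothetical, and the constants are the common IEEE-754 quiet-NaN encodings, which are not necessarily the exact values used in the sparse_sign test.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Reinterpret a 32-bit pattern as an IEEE-754 single-precision float.
// (Hypothetical helper, for illustration only.)
float floatFromBits(std::uint32_t bits) {
  static_assert(sizeof(float) == sizeof(std::uint32_t),
                "this sketch assumes a 32-bit float");
  float f;
  std::memcpy(&f, &bits, sizeof f);
  return f;
}

// Quiet NaN with the sign bit clear (0x7FC00000) or set (0xFFC00000).
// These encodings are an assumption, not taken from the test itself.
float makeNaN(bool negative) {
  float f = floatFromBits(negative ? 0xFFC00000u : 0x7FC00000u);
  assert(std::isnan(f));
  return f;
}
```

With these helpers, std::signbit(makeNaN(true)) is true and std::signbit(makeNaN(false)) is false, so the sign is present in the value even when "%g" does not print it.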

What is the desired approach to fix this test on Apple Silicon? The options I have come up with are:

Exclude the aarch64 architecture by specifying an XFAIL directive. (But this approach may be too broad, since other aarch64 platforms seem to support negative nan.)
Exclude the test from running on Apple Silicon only (is this possible?)
Move the test under a platform-specific folder if it assumes some platform dependency (e.g. mlir/test/Integration/Dialect/SparseTensor/CPU/X86).

Do you have any thoughts on this?

mehdi_amini · March 22, 2024, 7:08am

Can we fix this runtime function as follows?

extern "C" void printF32(float f) {
  // Print the sign of a negative nan explicitly, since "%g" may drop it.
  if (std::isnan(f) && std::signbit(f)) {
    fprintf(stdout, "-nan");
    return;
  }
  fprintf(stdout, "%g", f);
}

(assuming this is a libc issue and not a HW issue of course)

Lewuathe · March 25, 2024, 5:47am

Thanks! I ran the following code on my Apple Silicon machine and got the expected output. Using the standard library to detect the sign of the nan value looks like a viable way to make the test platform agnostic. I’m going to try to update the MLIR C runner utils too.

#include <stdio.h>
#include <cmath>


void printF32(float f)
{
  if (std::isnan(f) && std::signbit(f)) {
    fprintf(stdout, "-nan\n");
  } else {
    fprintf(stdout, "%g\n", f);
  }
}

int main()
{
  const float nan = 0.0/0.0;
  printF32(nan);

  const float neg_nan = -nan;
  printF32(neg_nan);

  const float small_value = 5.96046e-08;
  printF32(small_value);
}

./a.out
nan
-nan
5.96046e-08
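One possible way to carry this into the runner utils is to separate the formatting from the printing, so the nan handling can be checked without capturing stdout. This is only a sketch under assumptions: formatFloat is a hypothetical helper that does not exist in MLIR, and printF32 simply mirrors the signature quoted earlier in this thread.

```cpp
#include <cassert>
#include <cmath>
#include <cstdio>
#include <string>

// Hypothetical helper (not part of MLIR): format a floating-point value the
// way "%g" would, but always spell out the sign of nan explicitly.
template <typename T>
std::string formatFloat(T value) {
  if (std::isnan(value))
    return std::signbit(value) ? "-nan" : "nan";
  char buffer[64];
  int written = std::snprintf(buffer, sizeof buffer, "%g",
                              static_cast<double>(value));
  // "%g" output for a double fits comfortably in 64 bytes.
  assert(written > 0 && written < static_cast<int>(sizeof buffer));
  return std::string(buffer, static_cast<std::size_t>(written));
}

// Same signature as the runner function quoted above; the body is a sketch.
extern "C" void printF32(float f) {
  std::fputs(formatFloat(f).c_str(), stdout);
}
```

Because the template takes any floating-point type, the same helper could back a double-precision printer as well. Non-nan values still go through "%g", so output such as -0 and 5.96046e-08 is unchanged.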

banach-space · March 28, 2024, 8:36am

A bit late to the party

Hey @Lewuathe, I just wanted to say thank you for pushing on this - that’s really appreciated! And thanks to @mehdi_amini for proposing such a neat solution

-Andrzej
