Possible bug when using -O2/-O3 in clang 13 for ARMv7

Hi everyone,

This is kind of a follow-up to my previous email about compiling JSC using clang for ARMv7. When I build JSC with either -O2 or -O3, I get random garbage when querying the “Infinity” constant from JavaScript, as if the constant were never initialized. I’m sure the variable is being initialized correctly.

Some tests I did:

  1. Using -O1 or no optimization doesn’t trigger the issue.
  2. Using either -O2 or -O3 with the address or undefined behavior sanitizers doesn’t trigger the issue.
  3. Building JSC with clang 11.0.1-2 (from Debian) or clang 12.0.1 (from GitHub) doesn’t trigger the issue.
  4. The issue happens with clang 13.0.0 and 13.0.1-rc1 (both from GitHub).

It seems like some optimization introduced by -O2 is causing the issue.

Does anyone have any tips I can follow to improve this bug report? I’ll try to compile JSC with -O2 and disable the optimizations manually to pinpoint what’s causing the issue (hopefully it’s a single optimization and not a combination of them). Is there a flag in clang to print which optimizations are enabled for -O1 and -O2 so I can diff them?

I wish I had more information, but I’m still trying to debug why this is happening. I wanted to try to get more information first before opening a github issue.

Thanks in advance,

FWIW I think opening an issue with what you've got would be fine.

when I build JSC using either -O2 or -O3, I get random garbage when querying for the "Infinity" constant from javascript

Can you elaborate on what JSC is and how you do the query? Is it something like:
* build an interpreter
* interpret javascript code that prints infinity
* check for expected value

I know zero about javascript in general but if we can get a script to
do that then we could bisect it. It'll take a while but we (Linaro)
have access to some machines that could help there.
(assuming this presents on armv8 hardware, but if it doesn't it's at
least a data point)

It seems like some optimization introduced by -O2 is causing the issue.

Agreed

Is there a flag in clang to print which optimizations are enabled for -O1 and -O2 so I can diff them?

Yes but I can never remember which one it is, so let me try to find
it. Unless someone else knows it already and can reply.

Caveat that I'm on shaky ground when it comes to my understanding of
how the pass managers work.

I think clang 13 uses the new pass manager, so this would be what you want:
$ ./bin/clang /tmp/test.c -c -o /dev/null -O1 -Xclang -fdebug-pass-manager

Certainly I see a lot of differences trying that myself.
(https://llvm.org/docs/NewPassManager.html#invoking-opt)

That tells you which target-independent passes are being run. The
target-specific passes still use the old pass manager, and for those
you can use "-mllvm -debug-pass=Structure" instead.
(https://llvm.org/docs/NewPassManager.html#status-of-the-new-and-legacy-pass-managers)
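
To make the comparison concrete, here is a minimal sketch that captures the pass log at -O1 and -O2 and diffs the two. It only uses the -Xclang -fdebug-pass-manager invocation shown above; the CLANG variable, the function names, and the passes-O1.txt / passes-O2.txt file names are assumptions made for illustration, and it assumes the log goes to stderr.

```shell
#!/bin/sh
# Sketch: diff the new-pass-manager logs of two optimization levels.
# CLANG and the output file names are illustrative assumptions.

# dump_passes <level> <source> <outfile>
dump_passes() {
    # Assumes -fdebug-pass-manager logs to stderr, so capture stderr.
    "${CLANG:-clang}" "$2" -c -o /dev/null "-$1" \
        -Xclang -fdebug-pass-manager 2> "$3"
}

# diff_passes <source>: show log lines that differ between -O1 and -O2.
diff_passes() {
    dump_passes O1 "$1" passes-O1.txt
    dump_passes O2 "$1" passes-O2.txt
    # diff exits with status 1 when the files differ; that is expected.
    diff passes-O1.txt passes-O2.txt
}

# Example: diff_passes /tmp/test.c
```

The diff is likely to be noisy, since the log also records what each pass ran on, so it is probably easiest to read on a small test file.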

How you get an option from clang to opt to change which passes are
run, I couldn't work out. There is "-passes=" for opt but that won't
get through from clang with "-mllvm", maybe I just didn't do it
correctly.

Though I found a good suggestion from Mehdi Amini in another thread,
that would mean you could use opt directly:
"But you may start by bisecting the input files compiling half of them
with O1 and the other half with O0 and recurse till you find the file
that is miscompiled with O1. Then you can use opt to test variant of
the pipeline on this file and relink manually."
(https://groups.google.com/g/llvm-dev/c/Y1HHgpmBidM)
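
The halving step in that suggestion could be sketched roughly as below. This is only an outline under strong assumptions: exactly one source file is miscompiled, and you supply a command (called build_and_test in the example comment, a hypothetical name) that builds the files listed in its argument at the suspect optimization level, builds everything else at the known-good level, and returns 0 if the resulting binary behaves.

```shell
#!/bin/sh
# bisect_files <list-file> <test-cmd>
# <list-file> has one source file per line. <test-cmd> receives a file
# listing the sources to build at the suspect level and must return 0
# when the resulting binary is good. Assumes a single culprit file.
bisect_files() {
    list=$1 test_cmd=$2
    cur=$(mktemp) half=$(mktemp)
    cp "$list" "$cur"
    while [ "$(( $(wc -l < "$cur") ))" -gt 1 ]; do
        n=$(( $(wc -l < "$cur") ))
        head -n "$(( n / 2 ))" "$cur" > "$half"
        if "$test_cmd" "$half"; then
            # First half is fine, so the culprit is in the second half.
            tail -n "+$(( n / 2 + 1 ))" "$cur" > "$cur.next"
        else
            cp "$half" "$cur.next"
        fi
        mv "$cur.next" "$cur"
    done
    cat "$cur"
    rm -f "$cur" "$half"
}

# Example: bisect_files sources.txt build_and_test
```

Once a single file is left, that file is the one to feed to opt with variants of the pipeline, as the quoted suggestion describes.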

Hope that helps but in any case please go ahead and open an issue and
we can narrow it down further.

Hi,

FWIW I think opening an issue with what you’ve got would be fine.

Cool, I’ll try to add the information here and open the github issue.

when I build JSC using either -O2 or -O3, I get random garbage when querying for the “Infinity” constant from javascript

Can you elaborate on what JSC is and how you do the query? Is it something like:

  • build an interpreter
  • interpret javascript code that prints infinity
  • check for expected value

Right, so to build JSC, you need to get WebKit from https://github.com/WebKit/WebKit (shallow clone is a friend here), and run:

$ ./Tools/Scripts/build-jsc --release --jsc-only '--cmakeargs=-DCMAKE_CXX_COMPILER=/bin/clang++ -DCMAKE_C_COMPILER=/bin/clang'

Release builds use -O3 -DNDEBUG by default, and the binary ends up in WebKitBuild/Release/bin/jsc.

To build the debug version, replace --release with --debug; the binary then ends up in WebKitBuild/Debug/bin/jsc. To rebuild, you can either remove the WebKitBuild dir, or go into WebKitBuild/Release/ and run ninja clean followed by ninja.

The program I’m using is:

$ cat foo.js
print(Infinity)
let a = Infinity / Infinity
print(Number.isNaN(a))

JSC built in release mode (the value printed for Infinity changes on every run):

$ WebKitBuild/Release/bin/jsc foo.js
-1.1089394371691584e+269
false

Expected output:
$ ./WebKitBuild/Debug/bin/jsc foo.js
Infinity
true
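
For bisecting, the manual comparison above could be wrapped in a small pass/fail check along these lines. This is a sketch, not something from the WebKit tree: the function name check_jsc is made up, and it assumes that a correct jsc prints exactly the two lines shown under "Expected output".

```shell
#!/bin/sh
# check_jsc <path-to-jsc>: succeed only if jsc prints the expected output.
check_jsc() {
    dir=$(mktemp -d) || return 125
    # Same program as foo.js above.
    printf '%s\n' 'print(Infinity)' \
                  'let a = Infinity / Infinity' \
                  'print(Number.isNaN(a))' > "$dir/foo.js"
    actual=$("$1" "$dir/foo.js" 2>&1)
    rm -rf "$dir"
    expected=$(printf 'Infinity\ntrue')
    [ "$actual" = "$expected" ]
}

# Example: check_jsc WebKitBuild/Release/bin/jsc
```

The exit status follows the usual git bisect run convention (0 for good, 1 for bad, 125 to skip). A real bisection of clang would also have to rebuild clang and then JSC at every step, which is not shown here.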

I know zero about javascript in general but if we can get a script to
do that then we could bisect it. It’ll take a while but we (Linaro)
have access to some machines that could help there.
(assuming this presents on armv8 hardware, but if it doesn’t it’s at
least a data point)

I’m using:

$ uname -a
Linux bbox-11-armhf 5.10.0-0.bpo.7-arm64 #1 SMP Debian 5.10.40-1~bpo10+1 (2021-06-04) armv8l GNU/Linux

Caveat that I’m on shaky ground when it comes to my understanding of
how the pass managers work.

I think clang 13 uses the new pass manager, so this would be what you want:
$ ./bin/clang /tmp/test.c -c -o /dev/null -O1 -Xclang -fdebug-pass-manager

Someone suggested looking into Alias Analysis as well.

Certainly I see a lot of differences trying that myself.
(https://llvm.org/docs/NewPassManager.html#invoking-opt)

That tells you what the target generic passes being run are, then the
target specific things use the old pass manager which you can use
“-mllvm -debug-pass=Structure” for instead.
(https://llvm.org/docs/NewPassManager.html#status-of-the-new-and-legacy-pass-managers)

How you get an option from clang to opt to change which passes are
run, I couldn’t work out. There is “-passes=” for opt but that won’t
get through from clang with “-mllvm”, maybe I just didn’t do it
correctly.

Though I found a good suggestion from Mehdi Amini in another thread,
that would mean you could use opt directly:
“But you may start by bisecting the input files compiling half of them
with O1 and the other half with O0 and recurse till you find the file
that is miscompiled with O1. Then you can use opt to test variant of
the pipeline on this file and relink manually.”
(https://groups.google.com/g/llvm-dev/c/Y1HHgpmBidM)

I see, but I was thinking about something simpler: adding -O2 and -fno-foo flags to disable optimizations (e.g., -fno-slp-vectorize), WDYT?

I’ve found this post from '14 about that: https://stackoverflow.com/questions/15548023/clang-optimization-levels

Hope that helps but in any case please go ahead and open an issue and
we can narrow it down further.

Done: https://github.com/llvm/llvm-project/issues/52669

I see, but I was thinking about something simpler: adding -O2 and -fno-foo flags to disable optimizations (e.g., -fno-slp-vectorize), WDYT?

Worth a go for sure. I wasn't sure if all the passes would have an
-fno for them but if it's one of those then all the better.

Done: https://github.com/llvm/llvm-project/issues/52669

Great. I'll see if I can reproduce it myself.

I see, but I was thinking about something simpler: adding -O2 and -fno-foo flags to disable optimizations (e.g., -fno-slp-vectorize), WDYT?

Worth a go for sure. I wasn’t sure if all the passes would have an
-fno for them but if it’s one of those then all the better.

Didn’t work. Apparently the only flags -O2 adds are -vectorize-slp and -vectorize-loops, and disabling both didn’t solve the problem.
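
Since toggling the vectorizer flags did not help, one more avenue might be LLVM's -opt-bisect-limit option, which as far as I know can be passed through clang as -mllvm -opt-bisect-limit=N and skips optional passes after the N-th one. The sketch below is just the binary search over N. It assumes that limit 0 gives a good build, that the chosen maximum gives a bad one, and that there is a single good-to-bad transition; the test command named in the example comment is hypothetical and would have to rebuild the suspect file with the given limit, relink, and run the reproducer.

```shell
#!/bin/sh
# find_first_bad_limit <max> <test-cmd>
# <test-cmd> is called with a limit N and must return 0 when the build
# made with that limit behaves correctly. Prints the smallest bad limit.
find_first_bad_limit() {
    lo=0 hi=$1 test_cmd=$2   # invariant: lo is good, hi is bad
    while [ "$(( hi - lo ))" -gt 1 ]; do
        mid=$(( (lo + hi) / 2 ))
        if "$test_cmd" "$mid"; then
            lo=$mid
        else
            hi=$mid
        fi
    done
    echo "$hi"
}

# Example, with a hypothetical test command that compiles the suspect
# file using: clang++ -O2 -mllvm -opt-bisect-limit="$1" ...
#   find_first_bad_limit 5000 rebuild_and_check
```

The pass numbering restarts for every compiler invocation, so this is probably only practical once the problem has been narrowed down to a single source file.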