Problem with clang optimizer?

I’m not sure if this is the correct list, so please direct me to the right one if this bug report shouldn’t go here.

The problem is: invoking clang (v12) with -O2 or a higher optimization level generates wrong object code for the following C++. Compiling it with -O1 generates a working binary.

Hi Uri,

I tried to reproduce it on godbolt with clang 12.0.0 and 12.0.1, but things seem fine when I run it there: https://godbolt.org/z/vrq8j6Kj7.
Can you share your exact clang invocation? Does it only reproduce in some specific environment?

Also, it generally helps to reduce the code in bug reports as much as possible; creduce can help with that: https://embed.cs.utah.edu/creduce/using/.

-Jakub

Reproducer: Compiler Explorer
(cstddef and -march must be added as well)

Michael

opt-bisect points to SLP vectorizer. And it looks like it doesn’t fail on trunk.

Also, it seems to impact only macOS? One colleague wasn’t able to reproduce it with Clang-12 on Linux.

This is where this problem was discovered and is tracked, with all the details:

https://github.com/randombit/botan/issues/2802

If you nail this one, it would be great.

Thanks!

I tried to reproduce it on godbolt with clang 12.0.0 and 12.0.1, but things seem fine when I run it there: https://godbolt.org/z/vrq8j6Kj7.
Can you share your exact clang invocation? Does it only reproduce in some specific environment?

Save the source I posted before into a file named “sha3-reproducer.cxx”. Let me know if you want it re-posted here.

$ clang++-mp-12 -v

clang version 12.0.1

Target: x86_64-apple-darwin20.6.0

Thread model: posix

InstalledDir: /opt/local/libexec/llvm-12/bin

$ clang++-mp-12 -o s -O3 sha3-reproducer.cxx

$ ./s

Assertion failed: (T[0] == 16394434931424703552u), function main, file sha3-reproducer.cxx, line 103.

Abort trap: 6

$ clang++-mp-12 -o s -O2 sha3-reproducer.cxx

$ ./s

Assertion failed: (T[0] == 16394434931424703552u), function main, file sha3-reproducer.cxx, line 103.

Abort trap: 6

$ clang++-mp-12 -o s -O1 sha3-reproducer.cxx

$ ./s

$

Clang-12 is installed via Macports, which is why we invoke the executable as clang++-mp-12.

The same problem manifests in exactly the same way in the Xcode-13 version of Clang (presumably based on LLVM Clang-12).

I’ll be happy to provide more specific details if you let me know what you need.

Also, it generally helps to reduce the code in bug reports as much as possible; creduce can help with that: https://embed.cs.utah.edu/creduce/using/.

Understood. Unfortunately, the above reproducer is the best we could come up with. An alternative is trying to build the Botan package itself https://github.com/randombit/botan.git.

It is only occurring (as far as I can see now) on x86_64, with -mavx enabled. Or with a target CPU that supports AVX. And it is not Apple clang specific.

-Dimitry

It reproduced for me with -march=nehalem which does not have AVX.

I found that

  • The problem disappears with -mno-sse4.1
  • The problem manifests with both Apple Clang from Xcode-13, and LLVM Clang-12 (and not with Xcode-12 or LLVM Clang-11)
  • I could experiment only on Apple platform, as that’s the only one I have that runs LLVM Clang-12.

Looking at the IR here https://godbolt.org/z/zaMW1renW I believe the issue is on this instruction on line 361

%30 = extractelement <2 x <2 x i64>*> %bc438, i32 0

It should be extracting from index 1 instead of index 0.

This may be fixed now (https://reviews.llvm.org/D106613), but it remains to be confirmed for https://bugs.llvm.org/show_bug.cgi?id=51957

I just tried Clang-13 (with LLVM-13), and the problem is still there. Vectorizer still broken wrt. SSE-4.1 instruction extensions:

$ echo $CXXFLAGS

-std=gnu++17 -O3 -march=native -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk

$ clang++-mp-13 $CXXFLAGS -o t sha3-reproducer.cxx

$ ./t

Assertion failed: (T[0] == 16394434931424703552u), function main, file sha3-reproducer.cxx, line 103.

Abort trap: 6

$ clang++-mp-13 $CXXFLAGS -mno-sse4.1 -o t sha3-reproducer.cxx

$ ./t

$

Hi Uri,

Unfortunately, the fix for this didn't make it into 13.0.0, and will hopefully be part of 13.0.1 (when that comes out, I can't say).

-Dimitry

Hi Uri,
could you please reproduce this at godbolt.org? AFAIK, this issue is masked in clang-13, though the real fix isn’t backported (see Dimitry’s comment: https://bugs.llvm.org/show_bug.cgi?id=51957#c7).
I can’t reproduce it on clang-13: https://godbolt.org/z/4Mdrd5388
Thanks,
Anton

Tue, 26 Oct 2021 at 00:35, Blumenthal, Uri - 0553 - MITLL via llvm-dev <llvm-dev@lists.llvm.org>:

could you please reproduce this at godbolt.org?

Shows up with 12.0.0 and 12.0.1, but not on 13.0.0 on that site. But there seem to be some issues with that site – see below.

AFAIK, this issue is masked in clang-13,

I’m not sure. First, on my machines – it shows dependence on CPU (fails on Skylake, passes on Skylake-avx512).

$ clang++-mp-13 -v -O3 -march=native -o t sha3-reproducer.cxx

clang version 13.0.0

Target: x86_64-apple-darwin20.6.0

Thread model: posix

InstalledDir: /opt/local/libexec/llvm-13/bin

“/opt/local/libexec/llvm-13/bin/clang” -cc1 -triple x86_64-apple-macosx11.0.0 -Wundef-prefix=TARGET_OS_ -Werror=undef-prefix -Wdeprecated-objc-isa-usage -Werror=deprecated-objc-isa-usage -emit-obj --mrelax-relocations -disable-free -disable-llvm-verifier -discard-value-names -main-file-name sha3-reproducer.cxx -mrelocation-model pic -pic-level 2 -mframe-pointer=all -fno-rounding-math -munwind-tables -target-sdk-version=11.3 -fcompatibility-qualified-id-block-type-checking -fvisibility-inlines-hidden-static-local-var -target-cpu skylake -target-feature +sse2 -target-feature -tsxldtrk -target-feature +cx16 -target-feature +sahf -target-feature -tbm -target-feature -avx512ifma -target-feature -sha -target-feature -gfni -target-feature -fma4 -target-feature -vpclmulqdq -target-feature +prfchw -target-feature +bmi2 -target-feature -cldemote -target-feature +fsgsbase -target-feature -ptwrite -target-feature -amx-tile -target-feature -uintr -target-feature +popcnt -target-feature -widekl -target-feature +aes -target-feature -avx512bitalg -target-feature -movdiri -target-feature +xsaves -target-feature -avx512er -target-feature -avxvnni -target-feature -avx512vnni -target-feature -amx-bf16 -target-feature -avx512vpopcntdq -target-feature -pconfig -target-feature -clwb -target-feature -avx512f -target-feature +xsavec -target-feature -clzero -target-feature -pku -target-feature +mmx -target-feature -lwp -target-feature -rdpid -target-feature -xop -target-feature +rdseed -target-feature -waitpkg -target-feature -kl -target-feature -movdir64b -target-feature -sse4a -target-feature -avx512bw -target-feature +clflushopt -target-feature +xsave -target-feature -avx512vbmi2 -target-feature +64bit -target-feature -avx512vl -target-feature -serialize -target-feature -hreset -target-feature +invpcid -target-feature -avx512cd -target-feature +avx -target-feature -vaes -target-feature -avx512bf16 -target-feature +cx8 -target-feature +fma -target-feature -rtm -target-feature +bmi 
-target-feature -enqcmd -target-feature +rdrnd -target-feature -mwaitx -target-feature +sse4.1 -target-feature +sse4.2 -target-feature +avx2 -target-feature +fxsr -target-feature -wbnoinvd -target-feature +sse -target-feature +lzcnt -target-feature +pclmul -target-feature -prefetchwt1 -target-feature +f16c -target-feature +ssse3 -target-feature +sgx -target-feature -shstk -target-feature +cmov -target-feature -avx512vbmi -target-feature -amx-int8 -target-feature +movbe -target-feature -avx512vp2intersect -target-feature +xsaveopt -target-feature -avx512dq -target-feature +adx -target-feature -avx512pf -target-feature +sse3 -debugger-tuning=lldb -target-linker-version 650.9 -v -fcoverage-compilation-dir=/Users/ur20980/src -resource-dir /opt/local/libexec/llvm-13/lib/clang/13.0.0 -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk -I/usr/local/include -stdlib=libc++ -internal-isystem /opt/local/libexec/llvm-13/bin/…/include/c++/v1 -internal-isystem /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/local/include -internal-isystem /opt/local/libexec/llvm-13/lib/clang/13.0.0/include -internal-externc-isystem /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include -O3 -fdeprecated-macro -fdebug-compilation-dir=/Users/ur20980/src -ferror-limit 19 -stack-protector 1 -fblocks -fencode-extended-block-signature -fregister-global-dtors-with-atexit -fgnuc-version=4.2.1 -fcxx-exceptions -fexceptions -fmax-type-align=16 -fcolor-diagnostics -vectorize-loops -vectorize-slp -D__GCC_HAVE_DWARF2_CFI_ASM=1 -o /var/folders/_l/4q83bg9j5ysb7qd1n9xpnb4h0000gn/T/sha3-reproducer-beadcf.o -x c++ sha3-reproducer.cxx

clang -cc1 version 13.0.0 based upon LLVM 13.0.0 default target x86_64-apple-darwin20.6.0

ignoring nonexistent directory “/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/local/include”

ignoring nonexistent directory “/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/Library/Frameworks”

#include “…” search starts here:

#include <…> search starts here:

/usr/local/include

/opt/local/libexec/llvm-13/bin/…/include/c++/v1

/opt/local/libexec/llvm-13/lib/clang/13.0.0/include

/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include

/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/System/Library/Frameworks (framework directory)

End of search list.

“/opt/local/libexec/llvm-13/bin/ld” -demangle -lto_library /opt/local/libexec/llvm-13/lib/libLTO.dylib -dynamic -arch x86_64 -platform_version macos 11.0.0 11.3 -syslibroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk -o t -L/usr/local/lib /var/folders/_l/4q83bg9j5ysb7qd1n9xpnb4h0000gn/T/sha3-reproducer-beadcf.o -lc++ -lSystem /opt/local/libexec/llvm-13/lib/clang/13.0.0/lib/darwin/libclang_rt.osx.a

$ ./t

Assertion failed: (T[0] == 16394434931424703552u), function main, file sha3-reproducer.cxx, line 104.

Abort trap: 6

$

though the real fix isn’t backported (see Dimitry’s comment: https://bugs.llvm.org/show_bug.cgi?id=51957#c7).

See above – fails on Skylake.

And here, on Skylake-avx512 it seems to pass:

$ clang++-mp-13 -v $CXXFLAGS -O3 -march=native -o t sha3-reproducer.cxx

clang version 13.0.0

Target: x86_64-apple-darwin20.6.0

Thread model: posix

InstalledDir: /opt/local/libexec/llvm-13/bin

“/opt/local/libexec/llvm-13/bin/clang” -cc1 -triple x86_64-apple-macosx11.0.0 -Wundef-prefix=TARGET_OS_ -Werror=undef-prefix -Wdeprecated-objc-isa-usage -Werror=deprecated-objc-isa-usage -emit-obj --mrelax-relocations -disable-free -disable-llvm-verifier -discard-value-names -main-file-name sha3-reproducer.cxx -mrelocation-model pic -pic-level 2 -mframe-pointer=all -fno-rounding-math -munwind-tables -target-sdk-version=12.0 -fcompatibility-qualified-id-block-type-checking -fvisibility-inlines-hidden-static-local-var -target-cpu skylake-avx512 -target-feature +sse2 -target-feature -tsxldtrk -target-feature +cx16 -target-feature +sahf -target-feature -tbm -target-feature -avx512ifma -target-feature -sha -target-feature -gfni -target-feature -fma4 -target-feature -vpclmulqdq -target-feature +prfchw -target-feature +bmi2 -target-feature -cldemote -target-feature +fsgsbase -target-feature -ptwrite -target-feature -amx-tile -target-feature -uintr -target-feature +popcnt -target-feature -widekl -target-feature +aes -target-feature -avx512bitalg -target-feature -movdiri -target-feature +xsaves -target-feature -avx512er -target-feature -avxvnni -target-feature -avx512vnni -target-feature -amx-bf16 -target-feature -avx512vpopcntdq -target-feature -pconfig -target-feature +clwb -target-feature +avx512f -target-feature +xsavec -target-feature -clzero -target-feature -pku -target-feature +mmx -target-feature -lwp -target-feature -rdpid -target-feature -xop -target-feature +rdseed -target-feature -waitpkg -target-feature -kl -target-feature -movdir64b -target-feature -sse4a -target-feature +avx512bw -target-feature +clflushopt -target-feature +xsave -target-feature -avx512vbmi2 -target-feature +64bit -target-feature +avx512vl -target-feature -serialize -target-feature -hreset -target-feature +invpcid -target-feature +avx512cd -target-feature +avx -target-feature -vaes -target-feature -avx512bf16 -target-feature +cx8 -target-feature +fma -target-feature +rtm -target-feature +bmi 
-target-feature -enqcmd -target-feature +rdrnd -target-feature -mwaitx -target-feature +sse4.1 -target-feature +sse4.2 -target-feature +avx2 -target-feature +fxsr -target-feature -wbnoinvd -target-feature +sse -target-feature +lzcnt -target-feature +pclmul -target-feature -prefetchwt1 -target-feature +f16c -target-feature +ssse3 -target-feature -sgx -target-feature -shstk -target-feature +cmov -target-feature -avx512vbmi -target-feature -amx-int8 -target-feature +movbe -target-feature -avx512vp2intersect -target-feature +xsaveopt -target-feature +avx512dq -target-feature +adx -target-feature -avx512pf -target-feature +sse3 -debugger-tuning=lldb -target-linker-version 650.9 -v -fcoverage-compilation-dir=/Users/ur20980/src -resource-dir /opt/local/libexec/llvm-13/lib/clang/13.0.0 -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk -stdlib=libc++ -internal-isystem /opt/local/libexec/llvm-13/bin/…/include/c++/v1 -internal-isystem /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/local/include -internal-isystem /opt/local/libexec/llvm-13/lib/clang/13.0.0/include -internal-externc-isystem /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include -O3 -std=gnu++17 -fdeprecated-macro -fdebug-compilation-dir=/Users/ur20980/src -ferror-limit 19 -stack-protector 1 -fblocks -fencode-extended-block-signature -fregister-global-dtors-with-atexit -fgnuc-version=4.2.1 -fcxx-exceptions -fexceptions -fmax-type-align=16 -fcolor-diagnostics -vectorize-loops -vectorize-slp -D__GCC_HAVE_DWARF2_CFI_ASM=1 -o /var/folders/c6/lnc_0m093ys8w16md_fm1mnxhtfnj8/T/sha3-reproducer-5022a4.o -x c++ sha3-reproducer.cxx

clang -cc1 version 13.0.0 based upon LLVM 13.0.0 default target x86_64-apple-darwin20.6.0

ignoring nonexistent directory “/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/local/include”

ignoring nonexistent directory “/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/Library/Frameworks”

#include “…” search starts here:

#include <…> search starts here:

/opt/local/libexec/llvm-13/bin/…/include/c++/v1

/opt/local/libexec/llvm-13/lib/clang/13.0.0/include

/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include

/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/System/Library/Frameworks (framework directory)

End of search list.

“/opt/local/libexec/llvm-13/bin/ld” -demangle -lto_library /opt/local/libexec/llvm-13/lib/libLTO.dylib -dynamic -arch x86_64 -platform_version macos 11.0.0 12.0 -syslibroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk -o t /var/folders/c6/lnc_0m093ys8w16md_fm1mnxhtfnj8/T/sha3-reproducer-5022a4.o -lc++ -lSystem /opt/local/libexec/llvm-13/lib/clang/13.0.0/lib/darwin/libclang_rt.osx.a

$ ./t

$

I can’t reproduce it on clang-13: https://godbolt.org/z/4Mdrd5388

I can’t reproduce it with clang-13 on Godbolt, but unfortunately, it’s 100% consistent on my machines. Also, I’m not certain the Godbolt site uses the correct compiler. Here’s what it shows me:

[Attached image: image002.png]

So, requested Clang, but compiled with GCC???

Tue, 26 Oct 2021 at 00:35, Blumenthal, Uri - 0553 - MITLL via llvm-dev <llvm-dev@lists.llvm.org>:

Ok, I found that confusion comes from the different binaries supplied as “version 13”, see below.

Now I am confused.

Does it mean that distributions (Ubuntu on Linux, Macports on MacOS, etc.) took pre-release (aka, still-beta) LLVM/Clang and released it as Clang-13?

Or that LLVM or Clang were patched after being released, without updating the version numbers?

So, requested Clang, but compiled with GCC???

Godbolt actually uses Clang here; “-gcc-toolchain” is just its option to point Clang at a GCC installation.

I do not understand the above – that “checkbox-button” seems to be named “All compilation options” for the currently selected compiler/toolchain…?

On the other hand, I’m probably not the main customer of that tool, so my understanding is not crucial. ;-)

Yes, I’ve managed to reproduce it too on my Ubuntu box with the “clang-13” package installed: . . . . .

But that’s not the clang-13 used by me and godbolt, which is tagged as llvmorg-13.0.0 and was announced on 4 October 2021. The Ubuntu-packaged clang-13 uses tip commit e5f2898bc751, pushed on 27 March 2021, so it’s an older version.

I hear you – but what about Mac? Macports Clang-13.0.0 was released only a couple of days ago. I don’t know how to check its commit level, but I strongly doubt it’s as dated as, e.g., Ubuntu release…

If you want to get the issue in question fixed, you can use clang-13 from here: https://github.com/llvm/llvm-project/releases/tag/llvmorg-13.0.0. For instance, here is the apple-darwin version: https://github.com/llvm/llvm-project/releases/download/llvmorg-13.0.0/clang+llvm-13.0.0-x86_64-apple-darwin.tar.xz

I do, but I much prefer that Macports maintains “big” packages on my machines, and LLVM-Clang definitely qualifies. Thus, I’d rather not track llvm-project on GitHub myself.

Can I hope that the fix Dimitry mentioned will be in 13.0.1, which will (hopefully, eventually) filter downstream?

Hi Uri,
sorry for confusing you! I didn’t investigate this thoroughly enough (and had an old, strange clang-13 installation on a weird box).
Your issue does actually reproduce on godbolt with the “-march=skylake” option: https://godbolt.org/z/jaMP3W1cn .
That means the bug is masked only in particular configurations. As Dimitry said, the next release, clang-13.0.1, should include the complete fix.
Thanks,
Anton

Tue, 26 Oct 2021 at 16:40, Blumenthal, Uri - 0553 - MITLL <uri@ll.mit.edu>: