Clang9 UBSan and GMP

[Please cc me, as I am not on the list.]

When compiling GMP 6.1.2 with MacPorts clang9 on macOS 10.15, ‘make check’ reports one failure, but with UBSan ‘undefined’ turned on, it passes without a report. Should UBSan not report what it thinks is the issue?

You mentioned “the check gives one error” - which check?

The ‘make check’ of GMP 6.1.2. One of the tests fails, but with any UBSan ‘undefined’ option enabled (in ‘configure’), none do.

Hard to know what might be happening - what sort of failure you’re seeing, etc. Perhaps UBSan is stabilizing/changing unspecified rather than undefined behavior - or the test is failing due to some undefined behavior that UBSan doesn’t catch, etc.

It is just an abort trap. The ‘make check’ also passes if optimization is turned off. I would have expected UBSan not to change anything in optimization, but merely to report the issues it finds. Apparently it finds nothing, which may suggest a compiler bug.

UBSan adds code to check things, it necessarily changes optimizations by having those checks in. It shouldn’t affect the behavior of programs that don’t exhibit UB (but I imagine it could affect the behavior of programs relying on specific IB (Implementation Defined Behavior)).

Reducing a test case to better understand your code & be able to portably demonstrate the issue would help to get traction on understanding the behavior, seeing if it’s a clang bug, etc.

So then there probably is an issue with the optimization.

I just ran ‘gmp-6.1.2’ with MacPorts clang 9.0.0 and got:
../../../gmp-6.1.2/test-driver: line 107: 70037 Abort trap: 6 "$@" > $log_file 2>&1
FAIL: t-sqrlo

With Apple clang 11.0.0 (clang-1100.0.33.8), I got another ‘make check’ error, also only when using optimization:
../../../gmp-6.1.2/test-driver: line 107: 30266 Segmentation fault: 11 "$@" > $log_file 2>&1
FAIL: t-toom53

It’s pretty hard to conclude whether it’s a bug in your code or in the compiler, or both, without narrowing down a test case.

It is not my code; it belongs to gmp-6.1.2, and I merely happened to come across it. It passes on gcc9, so there is something that clang9 does.

It’s hard to know if it’s the compiler’s fault without a test case - due to the nature of undefined behavior and other things (implementation defined behavior and unspecified behavior) in C++, that the program behaves as expected with another compiler or another set of flags doesn’t give a strong indication as to where the problem is (in the code, in one of the compilers, etc).

  • Dave

The sources are available at [1]; it is written in C, not C++. I was hoping that something like UBSan would shed light on it, but the original question is answered: it changes optimization. The GMP developers say that they have caught some compiler bugs, but that is hard to do and time consuming.

1. https://gmplib.org

Yeah, coming across compiler bugs does happen - but more often it’s bugs in input programs. That is one of the reasons compiler engineers aren’t likely to jump on reproducing and reducing misbehaving programs: on the odds, it’s not a bug in the compiler.

That is the reason I tried UBSan, but as it changes optimization, it does not work.

UBSan doesn’t catch everything - you could also try ASan and/or valgrind, etc. (MSan if you want, but that’s a bit fussier/more work to use)

The GMP developers felt it was a compiler bug, so I think I will leave it at that. But thanks for the tips.

Hans, it’s challenging to give sensible advice/guesses without knowing which test is failing. Maybe I missed this information in the replies (please CC the list if you want follow up answers from more than just David). I am not a GMP developer, but note that GMP is regularly tested with ubsan and results are included at https://gmplib.org/devel/tm/gmp/date.html.

That’s my fault - Hans is not subscribed to the list, so the emails have to be approved by a moderator (me) & I hadn’t gotten around to it. Approved them all so they should now show up.

I ran the test & understand it a bit better now - so the abort is part of the code, when the test fails, the test harness uses abort to fail.

So this isn’t “clang causes abort” (it didn’t select a bad instruction, etc) this is “clang causes test failure” - this could be any number of things in terms of compiler optimizations due to overly dependent code (or due to miscompiles, to be sure). It’s possible the test relies on specific numeric results that the C++ programming language doesn’t guarantee (either overly high precision, or overly low precision - C++ allows extra precision in computations which can break numerical code that’s relying on certain precision, for instance).

So, yeah, really hard to say where the fault lies without further investigation.

Indeed, very hard to figure out. If it is some hidden undefined behavior causing it, the UBSan should have caught it, but it does not. The link that Matthew gave says that the GMP developers experienced a number of such issues with Clang. One can though turn off the optimizer, and the tests pass.

Indeed, very hard to figure out. If it is some hidden undefined behavior causing it, the UBSan should have caught it, but it does not.

Right - but especially with numerics (especially floating point) there’s loads of room for valid but different behavior between different compilers - behavior that isn’t UB. How much precision a certain mathematical equation maintains is really at the whim of the optimizers in a lot of ways.

The link that Matthew gave says that the GMP developers experienced a number of such issues with Clang. One can though turn off the optimizer, and the tests pass.

Sure - most of the numeric effects would only appear with optimizations. Without them, every numeric operation is just done in registers and the result written right back to memory (so there is no chance of excess precision leaking in by keeping the value in an 80-bit floating point register between multiple operations, nor any risk of fused operations that produce extra precision, etc).

The only way to know is to trace down/reduce the point where the values diverge & stare at the code to see who’s right.

  • Dave