Is SafeStack coloring safe to enable?

Since PR32143 (https://github.com/llvm/llvm-project/issues/31491) we’ve disabled using stack coloring for the unsafe stack by default, due to reported miscompilations. Is the stack-coloring optimization safe to use, given that report?

Fuchsia would like to be able to use this optimization, since it compiles with SafeStack by default. But we are not sure if it is safe to do so, given that the optimization is disabled by default in SafeStack.cpp due to this bug .

If the optimization is safe to use, then how do we vet that the original miscompilations are no longer an issue? Do we have guidelines for re-enabling optimizations after they cause miscompilation?

For context, the original report was from 2017. The compiler has had a significant amount of development in the intervening 5 years. I am unable to produce the miscompilations using the reproducer provided in the original report with clang releases going back until at least clang-9 (i.e., compiling with safe stack and safe stack coloring). I only stopped at clang-9 because I would have to change my development environment, so it may fail to reproduce with even older versions of clang. I also looked through the SafeStack tests, and didn’t find anything related to this issue either. I don’t feel confident saying that proves the optimization is now safe, but I’m unsure how to confirm it would still be a problem for the compiler today.

If the test cases work now, it seems like it should be safe to turn back on. It would be nice to know which change caused them to pass, though.

I can try out some older versions. Is this how you were trying to reproduce the failures:
clang -fsanitize=safe-stack -mllvm --safe-stack-coloring -O2 test.c

Thank you. It bothered me that I had incomplete info, so I went and repeated my testing going back to clang-4 on an Intel Xeon, and AMD Ryzen 7 with identical results. The bug appears in clang-4 to clang-7, but stops manifesting in clang-8. I still think it would be good to get confirmation on another system.

Bisecting may be our only recourse here, though that would take a significant amount of time. I have a suspicion that its a combination of transforms interacting rather than something specific in the coloring algorithm. I didn’t see anything that stood out in the blame list for the SafeStack passes, but my quick look was hardly comprehensive. If identifying the change is a blocker I can start bisecting today.

Yes that is how I tested the reproducer. I also tried out -O3 but the behavior seems to be the same.

Something like this could speed up bisecting significantly elfshaker/manyclangs: Repository hosting unofficial binary pack files for many commits of LLVM (github.com)
Although unfortunately it appears to start at clang-8.

1 Like

I can confirm this. I tested on older RHEL releases and the test case was failing with clang-7, but passing with clang-8.

I can try to bisect to find the actual commit that caused the test to pass.

Thanks for pointing this out. I realized we probably have some old artifacts in Fuchsia’s CI infrastructure I could use in a similar (if cumbersome) fashion.

I was able to narrow it down to commits between 105a3662546d8ec92641ef3a8dbebc51f954f2f8 and 4c9fa4a0a137c250ecb977ebed5577cc131e94e6. I’m not sure I’ll have much more time to dig into this until next week, but this at least narrows it down to a handful of changes.

There are a few interesting commits in that range. The changes to DAGCombiner seem likely, but one of the changes to GVN also seems like it could be related.

I bisected it down to this commit: e4c91f5c4c52cd36b8238d29b933f2866ea0d735

Based on what this commit does, it makes me think that the bug is still there but just hidden. I think we need more analysis of what was actually causing the crash in the first place.

safe-stack-coloring-bug.txt (23.4 KB)

I spent some time investigating this issue, and I believe the root cause has been fixed now in the main branch. Maybe someone with more knowledge of this pass can double check my analysis:

I’ve uploaded an LLVM IR tests case (with .txt extension to get around Discourse file-type rules) which can be used to reproduce the issue. I believe the problem is that the SafeStackColoring pass was miscalculating the live range of this alloca %temp.i = alloca [32 x i8], align 16 The pass was not tracking live ranges through PHI instructions and was missing the uses of this alloca in the if.end block

The best way to observe this is to build commit 64ad0ad5ed92fc9881075cffebce418d05f61288 and run:
opt --safe-stack-coloring -debug-only=safestackcoloring -safe-stack safe-stack-coloring-bug.txt -S -o /dev/null

When I run this test case with an opt build from the latest main branch, I can see that the live range looks to be correctly calculated:
opt -debug-only=stack-lifetime --safe-stack-coloring -debug-only=safestackcoloring -safe-stack safe-stack-coloring-bug.txt -S -o /dev/null

From my perspective this bug is fixed, and I think we can enable the pass in main now.

I’ve submitted a patch to re-enable this: ⚙ D120866 SafeStack: Re-enable SafeStack coloring optimization.

Thank you so much for your help with this.