Hi all,
Debugging a mis-compilation is always time-consuming. I recently experimented with bisecting LLVM to find the pass responsible for a mis-compilation, and I would like to share some ideas about how to make this work. I would also encourage the community to make each pass more bisectable with the help of DebugCounter.
We already have a very useful helper in LLVM for pass-level bisection: OptBisect. It currently works only with the legacy pass manager, though Fedor has proposed an ongoing effort to make something similar work with the new pass manager.
To take this to the next level, DebugCounter lets me set a limit inside a pass, at the level of individual transformations, to tell exactly which transformation in the pass caused the error. When we set the StopAfter value for a DebugCounter, the counter stops allowing transformations once that limit is reached.
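To make the skip/count behavior concrete, here is a minimal Python model of how a counter gates a transformation. It is only an illustrative sketch, not LLVM's C++ implementation: the class `CounterModel` and its method name are hypothetical, and it assumes the documented behavior that a counter rejects the first `skip` executions, allows the next `count`, and rejects the rest.

```python
class CounterModel:
    """Toy model of a debug counter (hypothetical; not LLVM's C++ code)."""

    def __init__(self, skip=0, count=None):
        self.skip = skip    # number of initial executions to reject
        self.count = count  # executions to allow after skipping; None = no limit
        self.seen = 0       # how many times the guarded code was reached

    def should_execute(self):
        self.seen += 1
        if self.seen <= self.skip:
            return False
        if self.count is None:
            return True
        return self.seen <= self.skip + self.count


# Eight candidate transformations, guarded with skip=2 and count=3.
counter = CounterModel(skip=2, count=3)
applied = [i for i in range(8) if counter.should_execute()]
print(applied)  # [2, 3, 4]
```

Only candidates 2, 3 and 4 are applied, so shrinking or growing `count` moves the boundary one transformation at a time, which is what makes bisection possible.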
In D50031 and rL337748, I added a way to print DebugCounter info: the `-print-debug-counter` flag. With this, writing a transformation-level bisection script becomes more straightforward.
The issue we face is that transformation-level bisection only works in passes that have DebugCounters, and very few passes have them today. DebugCounter is also very useful when a pass author debugs manually, without special bisection tooling. So I would encourage the community to add DebugCounters to your own passes to make debugging easier.
Adding DebugCounters is usually not too difficult. For example, I have patches that add DebugCounters to passes: D50092 and D50033.
I have already built a bisection tool to help with Android toolchain debugging on our side, and in our experience bisecting with OptBisect and DebugCounter saves time when debugging mis-compilations.
Feel free to post if you have other ideas on this. I hope this thread helps.
Thanks,
Zhizhou
Zhizhou Yang via llvm-dev <llvm-dev@lists.llvm.org> writes:
> To bring it to the next level, DebugCounter provides features for me
> to have an in-pass (transformation) level limit to tell which
> transform in the pass exactly caused the error. When we set StopAfter
> value for a DebugCounter, it will eventually stop there as a limit.
> And in D50031 and rL337748, I added a method to print DebugCounter
> info: the `-print-debug-counter` flag. With this, writing a
> transformation level bisection script will be more straightforward.
> I have already built a bisection tool to help Android toolchain debug
> on our side. So I will say that the bisecting idea with OptBisect and
> DebugCounter helps us save time while debugging mis-compilations.
There is already a DebugCounter bisect tool in utils/bisect-skip-count.
It is not documented, unfortunately. I had to figure it out by
inspection, but you use it by including "%(skip)d" and "%(count)d" in
the command you pass to bisect-skip-count. Those placeholders then get
filled in with concrete values, and your command should respond to them
appropriately. For example:
    bisect-skip-count bisect-command.sh "%(skip)d" "%(count)d" 2>&1 | tee bisect.out
bisect-command.sh presumably looks something like this:
    #!/bin/bash
    skip=$1
    count=$2
    opt --debug-counter=my-counter-skip=${skip},my-counter-count=${count}
    ...
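The "%(skip)d" and "%(count)d" placeholders are Python `%`-style format fields, so presumably the tool fills them in by formatting each argument against a dictionary of values. The sketch below shows that substitution in isolation; `fill_placeholders` is a hypothetical helper written for illustration, not a function taken from bisect-skip-count.

```python
def fill_placeholders(argv, skip, count):
    """Substitute %(skip)d / %(count)d in every argument (hypothetical helper)."""
    values = {"skip": skip, "count": count}
    # Arguments without placeholders pass through unchanged.
    return [arg % values for arg in argv]


template = ["bisect-command.sh", "%(skip)d", "%(count)d"]
print(fill_placeholders(template, skip=0, count=128))
# ['bisect-command.sh', '0', '128']
```

One consequence of this style of substitution is that a literal percent sign in the command would need to be written as `%%`.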
I recently used bisect-skip-count in this way very successfully to track
down an aliasing bug deep in the machine scheduler. I'm working on
documenting bisect-skip-count so people know about it. I can add
comments to the script but I haven't looked at updating web page sources
yet. I was thinking of adding something to the existing opt-bisect
page. Guidance here would be helpful.
I agree that anyone who adds DebugCounters should propose those changes
on Phabricator. We can incrementally improve the debuggability of LLVM
with such a process.
-David
I read through this tool, and I believe it would do a great job as a general bisection script using DebugCounter.
Just one small nit: I noticed that you are using (1<<32) as the default upper bound. I have a patch that
prints the total number of transformations in a single pass: https://reviews.llvm.org/D50031. I think it
may be helpful to let users know exactly how many transformations there are in the pass, so that they
have a high-level idea of where the bad transformation is.
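To illustrate why a known total helps, here is a rough sketch of a binary search over the count value with an explicit upper bound. It is not the logic of bisect-skip-count: `first_bad_count` and `is_bad` are hypothetical names, and the sketch assumes the failure is monotonic, meaning that once enough transformations are enabled to trigger the bug, enabling more keeps it failing.

```python
def first_bad_count(is_bad, upper):
    """Return (smallest failing count in [1, upper], number of probes).

    Assumes is_bad(0) is False, is_bad(upper) is True, and monotonicity.
    """
    lo, hi = 0, upper  # invariant: is_bad(lo) is False, is_bad(hi) is True
    probes = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        probes += 1
        if is_bad(mid):
            hi = mid
        else:
            lo = mid
    return hi, probes


# Stand-in for running the real test: pretend transformation #37 is the culprit.
def is_bad(count):
    return count >= 37


print(first_bad_count(is_bad, 100))      # tight bound, e.g. from a printed total
print(first_bad_count(is_bad, 1 << 32))  # default-style upper bound
```

In this toy setup both calls find count 37, but the tight bound needs 7 probes while the (1<<32) bound needs 32. Since each probe is a full compile-and-test run, knowing the real total can shorten a bisection noticeably.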