Recently, I read two papers [1], [2] about finding the root causes of compiler bugs. However, I could not find any information in these papers about how compiler developers find the root causes of compiler bugs in practice, so I am curious whether these techniques are useful in practice. In my experience, the outputs of the compiler, such as the IR after each pass or the backtrace, are what people usually use to isolate the cause of a compiler bug.
I am new to LLVM, so I would like to know how LLVM or GCC developers find the root causes of compiler bugs in practice.
[1] Junjie Chen, Jiaqi Han, Peiyi Sun, Lingming Zhang, Dan Hao, and Lu Zhang. 2019. Compiler bug isolation via effective witness test program generation. In Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE 2019). Association for Computing Machinery, New York, NY, USA, 223–234. DOI:https://doi.org/10.1145/3338906.3338957
[2] Junjie Chen, Haoyang Ma, and Lingming Zhang. 2020. Enhanced compiler bug isolation via memoized search. In Proceedings of the 35th IEEE/ACM International Conference on Automated Software Engineering (ASE 2020).
I'm not sure I understand what you're asking. Once you know what a
compiler is doing wrong, you find the "root cause" of that going wrong
in the same way as you'd debug any other program. Or are you asking
how compiler developers determine what's causing a compiler issue?
In general, finding the root cause in LLVM is not much different from debugging any other software. It depends on the scenario: if it’s a crash, then running it under gdb is probably the first thing you want to do, and this can usually tell you the answer pretty fast.
Trickier scenarios usually require developers to use various LLVM-specific diagnostic features, the -print-after-all CLI option in opt, to name one, to get more insight into the intermediate steps and help narrow down the problematic region. If the input is too big, a more advanced tool like bugpoint (also an LLVM-specific tool) can help you bisect and trim the input. After this (pre)processing, normal debugging tricks like gdb or even good old printf can be applied easily.
IMHO how efficiently you find the root cause depends heavily on your engineering experience, and I 100% agree that it sometimes takes a lot of time. So it kinda makes sense that people want to automate it, but I’m not an expert on this matter. All I know is that there has been tons of research and effort on finding bugs. There might be some overlap between these two topics, I’m not really sure. But you might want to check out techniques like fuzzing and sanitizers. LLVM has pretty mature implementations of both.
I skimmed the first paper. I’m assuming they picked mis-compile bugs (bad code generation). It looks like a technique to triangulate on the likely buggy module by mutating the example code provided with a bug report, and comparing coverage traces for “good” and “bad” sources. It’s very black-box, assuming no diagnostic aids from the software under test, but that’s reasonable for an automated technique.
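For anyone unfamiliar with the general idea of comparing coverage between "good" and "bad" inputs, here is a small Python sketch of one common way to do it: score each compiler source file by how strongly its coverage correlates with the failing runs. The Ochiai formula used below is my choice for illustration and I have not checked that it matches what the paper does; the function name and the file names in the usage example are hypothetical.

```python
import math
from typing import Iterable, List, Set, Tuple


def ochiai_ranking(
    failing_cov: Iterable[Set[str]], passing_cov: Iterable[Set[str]]
) -> List[Tuple[str, float]]:
    """Rank code units (e.g. compiler source files) by Ochiai suspiciousness.

    Each element of failing_cov / passing_cov is the set of units covered
    while compiling one failing or one passing test program.
    """
    failing = list(failing_cov)
    passing = list(passing_cov)
    total_fail = len(failing)
    units = set().union(*failing, *passing)

    scores = {}
    for unit in units:
        ef = sum(unit in cov for cov in failing)  # failing runs covering unit
        ep = sum(unit in cov for cov in passing)  # passing runs covering unit
        denom = math.sqrt(total_fail * (ef + ep))
        scores[unit] = ef / denom if denom else 0.0

    # Most suspicious first; ties broken by name so the order is stable.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    # Hypothetical coverage data: one failing program, two passing mutants.
    failing = [{"InstCombine.cpp", "GVN.cpp"}]
    passing = [{"GVN.cpp"}, {"GVN.cpp", "LICM.cpp"}]
    for unit, score in ochiai_ranking(failing, passing):
        print(f"{score:.3f}  {unit}")
```

A file that is covered only by the failing compile ends up at the top, while files that are covered by passing compiles as well are pushed down. That is the intuition behind generating passing variants of the bug-triggering program.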
In my experience, many bugs have fairly obvious origins (at least in terms of likely source modules) to someone experienced in a given area. But I could see this being a useful tool for bugs with less obvious origins, and certainly a lot less tedious than wading through lots of diagnostic output. Also could be useful to people less familiar with LLVM, as one way to narrow down the search for a bug without needing to learn a lot about LLVM’s own diagnostic tools.
FTR, the paper says the benchmark and code are available at the project webpage: https://github.com/JunjieCheck/DiWi if anyone wants to try it out.