Modularity of LLVM and gcc

First of all, I don’t mean to start a tool war; I just want a discussion.

I’m a graduate student researching compiler optimization algorithms.

I found that for some code patterns, LLVM’s optimizer at O1, O2, and O3 produces code that performs worse than at O0.

So I made some improvements to the optimizer, and they do fix the problem and improve performance.

Now I’m surveying related work for my thesis, and I want to include gcc in it.

This performance regression doesn’t appear with gcc, so my thesis advisor asked me: why not just use gcc instead of LLVM for this code pattern? What is the point of your thesis? Why must you make this change in LLVM instead of simply using gcc?

At first, my argument was: “The modularity of LLVM is better than that of gcc. If I handle this problem at the LLVM IR level, it is solved permanently. If a new high-level language or a machine with a new ISA comes along, its developers won’t need to consider this problem.
The problem doesn’t show up in gcc for now. But gcc is tightly coupled, and information needs to be passed between its phases. If a new high-level language or machine is created, this problem needs to be considered when developers write the frontends and backends.”

But after surveying the gcc architecture these past few days, I learned about GENERIC in gcc. It’s a language-independent IR, and every language can be converted to it before being passed to the middle-end and backend. It may not be easier to develop a frontend for gcc than for LLVM, but GENERIC is still an interface to the middle-end, including gcc’s optimizer. So the point that “it’s not necessary to consider this problem when creating a new high-level language once it is solved in LLVM IR” may not be true.
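
To make the shared-IR argument concrete, here is a toy sketch in Python. Nothing in it is real LLVM or gcc API: the frontends, the backends, the tuple-based IR, and the `drop_mul_by_one` pass are all hypothetical names made up for illustration. It only shows that a fix placed in a language-independent middle-end is picked up by every frontend/backend pair, which would apply to any compiler with such an IR, not just LLVM.

```python
from typing import Callable, Dict, List, Tuple

# Toy IR: a list of (opcode, operand) pairs. Purely illustrative,
# not modeled on LLVM IR, GENERIC, or GIMPLE.
Instr = Tuple[str, object]
IR = List[Instr]


def frontend_a(src: str) -> IR:
    # Hypothetical frontend for language A; emits one redundant "mul 1".
    return [("load", src), ("mul", 1), ("ret", None)]


def frontend_b(src: str) -> IR:
    # Hypothetical frontend for language B; emits two redundant "mul 1".
    return [("load", src), ("mul", 1), ("mul", 1), ("ret", None)]


def drop_mul_by_one(ir: IR) -> IR:
    # Shared middle-end pass (the "fix"): remove multiplications by 1.
    return [ins for ins in ir if ins != ("mul", 1)]


def backend_x(ir: IR) -> List[str]:
    # Hypothetical backend for target X: one pseudo-instruction per opcode.
    return [f"x.{op}" for op, _ in ir]


def backend_y(ir: IR) -> List[str]:
    # Hypothetical backend for target Y.
    return [f"y.{op}" for op, _ in ir]


FRONTENDS: Dict[str, Callable[[str], IR]] = {"A": frontend_a, "B": frontend_b}
BACKENDS: Dict[str, Callable[[IR], List[str]]] = {"X": backend_x, "Y": backend_y}


def compile_all(src: str) -> Dict[Tuple[str, str], List[str]]:
    # Run every frontend/backend combination through the same middle-end pass.
    results = {}
    for fname, frontend in FRONTENDS.items():
        for bname, backend in BACKENDS.items():
            results[(fname, bname)] = backend(drop_mul_by_one(frontend(src)))
    return results


if __name__ == "__main__":
    for pair, asm in compile_all("v").items():
        print(pair, asm)
```

Adding a third frontend or backend to the dictionaries would get the same fix without touching the pass, which is the property I was originally claiming only for LLVM.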

I think it’s worthwhile to handle the problem in LLVM, but I couldn’t find reasons for doing so, especially in comparison with gcc.

The only reasons I can think of for now are the license and LLVM’s retargetability. More companies build compilers for their own machines on top of LLVM, and the problem will not show up for them because I solved it in LLVM IR. They don’t need to think about it.

I don’t know whether this kind of question can be posted here. If any points or concepts are controversial and make you uncomfortable, I apologize.

I’ll keep it short:

It’s totally fine to use GCC if that works better for your case. I doubt anyone here will be offended :slight_smile:

GCC is often better, especially for CPU tasks. Clang is sometimes better, and pretty decent/versatile wrt. GPUs. Some of the differences are conceptual (e.g., missing kinds of optimizations/analyses) but some are somewhat of a tossup (e.g., different heuristics help case A but hurt case B).
Many people here appreciate the license of LLVM a lot; that’s certainly a big driver. Others are here because most research tools are built on top of LLVM nowadays, so you can pick these up more easily. There are probably plenty of other reasons, though not all of them might be interesting to you.

At the end of the day it’s good that we have 2 competing open source compilers, I think.

(FTR, all of these are empirical findings, I have no numbers to back them up nor will I try to find those.)