Hint about how to contribute to LLVM

Dear all,

If possible, I would like to contribute to LLVM. First of all, I should
say that I'm a newbie to LLVM, and my experience in compiler
implementation is limited to simple, academic "toy projects".

My main interest is in improving the code generated by LLVM, where by
"improve" I mean "better performance". I'm interested, for example, in
machine-independent optimizations but, reading more about LLVM, it
seems to be complete.

If possible, I would like some suggestions about current possibilities
and, beyond that, "additional reading" and a "current status" if someone
is currently working on this.

Looking at http://llvm.org/OpenProjects.html, one point seems interesting:

"Miscellaneous Improvements. Move more optimizations out of the
-instcombine pass and into InstructionSimplify. The optimizations that
should be moved are those that do not create new instructions, for
example turning sub i32 %x, 0 into %x. Many passes use
InstructionSimplify to clean up code as they go, so making it smarter
can result in improvements all over the place.".

For a newbie, how complex is this task? Would someone
suggest other tasks?

Thanks in advance for your attention.

Alex Garzao <alexgarzaol@gmail.com> writes:

Looking at http://llvm.org/OpenProjects.html, one point seems interesting:

"Miscellaneous Improvements. Move more optimizations out of the
-instcombine pass and into InstructionSimplify. The optimizations that
should be moved are those that do not create new instructions, for
example turning sub i32 %x, 0 into %x. Many passes use
InstructionSimplify to clean up code as they go, so making it smarter
can result in improvements all over the place.".

For a newbie, how complex is this task? Would someone
suggest other tasks?

I think this is a pretty good place to start. The transformations are
simple and it should be fairly mechanical. It should be a good
introduction to the target-independent optimizer.

It would also help a lot!
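
To make "does not create new instructions" concrete, here is a minimal sketch on a toy IR. Everything in it (Kind, Value, simplifySub) is a hypothetical name invented for illustration and is not LLVM's actual API; it only shows the shape of a fold like "sub i32 %x, 0 into %x".

```cpp
#include <cassert>
#include <cstdint>

// Toy IR, not LLVM's classes: a value is a constant, a function
// argument, or a subtraction of two existing values.
enum class Kind { Constant, Argument, Sub };

struct Value {
    Kind kind;
    std::int32_t constant;  // meaningful only for Kind::Constant
    const Value *lhs;       // meaningful only for Kind::Sub
    const Value *rhs;       // meaningful only for Kind::Sub
};

// Hypothetical helper: returns an already-existing value equivalent to
// `lhs - rhs`, or nullptr if no such value is known. It never allocates
// a new Value, which is the property the OpenProjects text asks for.
const Value *simplifySub(const Value *lhs, const Value *rhs) {
    // x - 0  ->  x
    if (rhs->kind == Kind::Constant && rhs->constant == 0)
        return lhs;
    return nullptr;  // nothing simpler is known
}
```

Because the helper can only hand back something that already exists, a caller can use it while it is in the middle of its own work, without worrying about new instructions appearing. A fold that has to build a replacement instruction would not fit this shape.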

                               -Dave

Alex Garzao wrote:

Dear all,

If possible, I would like to contribute to LLVM. First of all, I should
say that I'm a newbie to LLVM, and my experience in compiler
implementation is limited to simple, academic "toy projects".

My main interest is in improving the code generated by LLVM, where by
"improve" I mean "better performance". I'm interested, for example, in
machine-independent optimizations but, reading more about LLVM, it
seems to be complete.

Far from it! Take a look inside lib/Target/README.txt. It's full of entries like this:

[LOOP DELETION]

We don't delete this output free loop, because trip count analysis doesn't
realize that it is finite (if it were infinite, it would be undefined). Not
having this blocks Loop Idiom from matching strlen and friends.

void foo(char *C) {
  int x = 0;
  while (*C)
    ++x,++C;
}

A number of them are hard to approach (if they were easy, someone would have done them -- try reading the file from the bottom, as new entries get appended), but they're all good places to start poking into the optimizer. As a beginner, I'd avoid ones that have caveats like "but to do this, we need to change the codegen to legalize it back" (i.e., undo the optimization if the target doesn't have the appropriate instruction).
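
For the loop-deletion entry quoted above, a small sketch may help show the connection to strlen. The function name countUntilNul is made up for illustration; it is the README's loop with the counter returned instead of thrown away, which makes it a hand-written strlen.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical variant of foo() from the README entry: same loop, but
// the counter is returned, so the loop computes the string length.
std::size_t countUntilNul(const char *c) {
    std::size_t x = 0;
    while (*c) {  // stop at the terminating NUL byte
        ++x;
        ++c;
    }
    return x;
}
```

In foo() itself the counter is never used and the loop writes nothing, so the only thing standing between the loop and deletion is proving that it terminates, which is the point the entry makes about trip count analysis.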

Another great source of things to work on, I find, is just playing with the STL, writing:

   #include <vector>

   void foo() {
     std::vector<int> v;
   }

and verifying that this actually gets deleted. Try different base types instead of vector, or maybe call methods on it (e.g., add "v.push_back(5);" to this example, and now the code doesn't get deleted!).

Don't worry about the performance impact of your changes yet. Once you're comfortable analyzing LLVM IR and adding optimizations (or learning when to move on), you'll start seeing missed optimizations in .ll files everywhere. You don't necessarily want to start with a profiler to decide which optimizations to tackle; a memory access saved here may avoid calling a slow function entirely, or call it half as often, etc. I generally consider optimizations that apply to load/store instructions to be the most important, then optimizations on loop structure, then libcalls, float math (hard, but lots of low-hanging fruit), integer divide/remainder, and finally everything else.

Nick

If you need hints about what you could optimize, ask me. I have a whole bunch of cases where LLVM does not generate the best code.

Have you filed bugs?

-eric

Not yet. But I asked on the mailing list and in the IRC channel.

2011/10/13 Eric Christopher <echristo@apple.com>

We're very interested in small testcases where LLVM produces suboptimal code. Please post your list here or file bugs.

- Ben

Reported as #11142

2011/10/15 Benjamin Kramer <benny.kra@googlemail.com>