A faster instruction selector?

Hi Nicolas and Dan,

Thanks for your replies.
I've been playing around with various settings, as you suggested.

> What version of LLVM are you using here?
I'm using 2.4

My original time ratio of register allocation to instruction selection (1:12) referred to the local register allocator and the standard instruction selector (all passes), which I realise is not a sensible comparison.

> I did some benchmarks in vmkit (the project that encompass ladyvm) and
> compilation time is roughly 50% faster with both fast and local register
> allocator.

Choosing the fast selector does speed code-generation by almost double,
when using llc, but the reduction in final code speed is obviously a downside.
Since my toolkit generates an interpreter, I am able to just compile hotspots, so final speed of compiled code is quite important.

I did some runs of llc and lcc with valgrind to get some less noisy timings, and the relative times are:
lcc (lex, parse, etc.)           4
llc (normal settings)           18
llc (fast and local reg alloc)  11

Obviously the ratios will change depending on the input, but as you can see, there is still enormous potential for improvement.
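
To put those relative times in perspective, here is a minimal Python sketch of the arithmetic. The dictionary keys and the `ratio` helper are illustrative names only, not anything from llc or lcc; the numbers are the three relative times listed above.

```python
# Relative valgrind timings from the list above (unitless; lower is faster).
# The keys are illustrative labels, not tool options.
timings = {
    "lcc": 4,              # lcc, whole run (lex, parse, etc.)
    "llc_normal": 18,      # llc with normal settings
    "llc_fast_local": 11,  # llc with fast selector and local reg alloc
}


def ratio(slower, faster):
    """Return how many times longer `slower` takes than `faster`."""
    return timings[slower] / timings[faster]


if __name__ == "__main__":
    print(f"llc normal vs lcc:         {ratio('llc_normal', 'lcc'):.2f}x")
    print(f"llc fast+local vs lcc:     {ratio('llc_fast_local', 'lcc'):.2f}x")
    print(f"llc normal vs fast+local:  {ratio('llc_normal', 'llc_fast_local'):.2f}x")
```

On these numbers, the fast selector plus local register allocator makes llc roughly 1.6 times faster, which still leaves it about 2.75 times slower than lcc.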

> Do you have time to do more detailed profiling? It might be interesting
> to see which parts of codegen are hot in your use cases.

Unfortunately I could not get any meaningful information from valgrind as to where llc is spending its time (Unknown code: 45%), but I suspect the relative slowness is caused by the number of intermediate data structures required and the time spent in creating them.

I notice in the docs http://llvm.org/docs/CodeGenerator.html#selectiondag_future that
"auto-generate entire selector from .td file" is a possible future
development. I guess that's what I'm hoping for :-)

I realise I'm asking a lot and not offering much, but I think a fast JIT compiler would really be an asset to LLVM. After all, more and more software is being run on VMs rather than directly on the hardware.

Finally, it must be possible to select the register allocator for the JIT using the API, but I am unable to find out how to do this, any ideas?

Cheers,
Mark.

> Choosing the fast selector does speed code-generation by almost double,
> when using llc, but the reduction in final code speed is obviously a
> downside.

[...]

> Since my toolkit generates an interpreter, I am able to just compile
> hotspots, so final speed of compiled code is quite important.

I recommend doing some measurements here rather than
guessing. I note that you've posted several numbers comparing
compile times with lcc, but no numbers comparing the quality of
the generated code yet :-).

>> Do you have time to do more detailed profiling? It might be interesting
>> to see which parts of codegen are hot in your use cases.

> Unfortunately I could not get any meaningful information from valgrind as to
> where llc is spending its time (Unknown code: 45%), but I suspect the
> relative slowness is caused by the number of intermediate data
> structures required and the time spent in creating them.

On some hosts LLVM defaults to being built with -fomit-frame-pointer,
and this interferes with some performance tools, so that's something
worth checking.

This would also be a good time to sanity check that you're using an
optimized build of LLVM, which can make a tremendous difference.
FWIW, the default mode for LLVM Makefiles is non-optimized.

> Finally, it must be possible to select the register allocator for the
> JIT using the API, but I am unable to find out how to do this, any ideas?

See RegAllocRegistry.h and RegisterRegAlloc::setDefault.

Dan