[PATCH] increase the max number of physical registers

Hello,

Attached is a trivial patch to increase the max number of physical
registers in LLVM from 1024 to 16384.

In our TCE toolset we allow the designer to freely choose the number
of registers in the designed TTA processors, and recently, while
experimenting with using TTA for a GPU design, we have bumped into
this limit several times.

What has made matters a bit worse for us is that we need to
reserve two reg indices for each register so that it can be used
for both integer and floating point computation. The current
limit is therefore effectively 512 registers for us, and the patch
lifts it to 8K, which should be enough for a while.
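
The arithmetic above can be sketched in a few lines of C++. This is only an illustration, assuming the limit is the boundary at which virtual register numbers start (the FirstVirtualRegister constant quoted later in this thread); the function names are hypothetical and are not LLVM API.

```cpp
#include <cassert>

// Hypothetical helper: a register number below the boundary is treated
// as a physical register, anything at or above it as a virtual one.
inline bool isPhysicalRegister(unsigned Reg, unsigned FirstVirtual) {
  return Reg < FirstVirtual;
}

// Hypothetical helper: number of architectural registers that fit when
// each one consumes IndicesPerReg physical register indices.
inline unsigned usableRegisters(unsigned FirstVirtual, unsigned IndicesPerReg) {
  assert(IndicesPerReg != 0 && "each register needs at least one index");
  return FirstVirtual / IndicesPerReg;
}
```

With two indices per register, a boundary of 1024 gives 512 usable registers and a boundary of 16384 gives 8192.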

The patch applies cleanly to LLVM 2.5, 2.6 and trunk.
'make check' passes in trunk with this patch.

Best regards,

llvm-more-phys-regs.patch (749 Bytes)

Hello, Pekka

Attached is a trivial patch to increase the max number of physical
registers in LLVM from 1024 to 16384.

Just random thought - could this be a configure-time option?

Anton Korobeynikov wrote:

Attached is a trivial patch to increase the max number of physical
registers in LLVM from 1024 to 16384.

Just random thought - could this be a configure-time option?

I'd say that if there is no strong reason (I have no clue if
there is) to keep the number lower, I think the configure option
just adds clutter to the configure script.

This is fine to me in principle, but please make sure this doesn't impact compile time or memory usage of llc somehow.

-Chris

Hi,

Chris Lattner wrote:

This is fine to me in principle, but please make sure this doesn't
impact compile time or memory usage of llc somehow.

OK. Any recommended way to do this? Is there some nice way to benchmark
the speed and memory consumption of llc in the LLVM testing infra at the
moment, or should I just use 'top' to inspect the memory consumption
and 'time' for speed measurements while generating code from
a big bitcode file?

Please compare release-asserts builds (make ENABLE_OPTIMIZED=1 DISABLE_ASSERTIONS=1) with and without your patch. Just run 'llc' on a collection of large bc files (e.g. kimwitu++ from the testsuite, some SPEC2K6 programs if you have access to them, etc.) and compare the results.

Thanks!

-Chris

Chris Lattner wrote:

Please compare release-asserts builds (make ENABLE_OPTIMIZED=1
DISABLE_ASSERTIONS=1) with and without your patch. Just run
'llc' on a collection of large bc files (e.g. kimwitu++ from the
testsuite, some SPEC2K6 programs if you have access to them, etc.) and
compare the results.

OK. I compared the speed and memory consumption of llc on
an 8M bitcode file (AES encryption of 64 blocks, fully unrolled to
a single BB). No difference. Both builds had about 700M peak virtual
memory consumption and took about 46 minutes for 10 rounds.

More elaborate results:

Executed:

time for i in $(seq 10); do echo $i; llc aes.bc -f -o aes.o; done;

Peak memory consumption monitored with

while true; do pmap $(pgrep llc) | tail -1; sleep 1; done;

The smallest and largest max values (of the 10 runs)
were picked from the output.

Without the patch:

mem 690960K ... 715476K
real 46m53.966s
user 46m32.299s

With the patch:

mem 691096K ... 715608K
real 46m10.101s
user 45m56.944s

Hello,

Can someone please commit this patch?

Thanks.

Here's the actual patch, sorry ;)

llvm-more-phys-regs.patch (749 Bytes)

Applied in r90789.

Dan

This caused a massive slowdown in the post-RA scheduler (llc -O3 on x86, -O2 on ARM). I'm going to revert it for now until that has been addressed.

Evan

Probably caused by this member:

    /// KillIndices - The index of the most recent kill (proceeding bottom-up),
    /// or ~0u if the register is not live.
    unsigned KillIndices[TargetRegisterInfo::FirstVirtualRegister];

And this:

  std::fill(KillIndices, array_endof(KillIndices), ~0u);

It should probably be dynamically allocated with TRI->getNumRegs() members instead.
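
A minimal sketch of that suggestion, assuming the cost comes from filling an array sized by the global maximum every time the state is reset. The class and method names here are hypothetical stand-ins, not the actual LLVM code or the eventual fix; NumRegs plays the role of the value returned by TRI->getNumRegs().

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the scheduler state that owns KillIndices.
class KillTracker {
public:
  // Size the table by the number of registers the target really has,
  // instead of by the compile-time maximum.
  explicit KillTracker(unsigned NumRegs) : KillIndices(NumRegs, ~0u) {}

  // Mark every register as not live. The cost is proportional to
  // NumRegs, so raising the global maximum no longer affects it.
  void reset() { std::fill(KillIndices.begin(), KillIndices.end(), ~0u); }

  // Remember the index of the most recent kill of Reg.
  void recordKill(unsigned Reg, unsigned Index) {
    assert(Reg < KillIndices.size() && "register number out of range");
    KillIndices[Reg] = Index;
  }

  // ~0u is the "not live" marker, as in the quoted comment.
  bool isLive(unsigned Reg) const {
    assert(Reg < KillIndices.size() && "register number out of range");
    return KillIndices[Reg] != ~0u;
  }

  unsigned killIndex(unsigned Reg) const {
    assert(Reg < KillIndices.size() && "register number out of range");
    return KillIndices[Reg];
  }

  std::size_t size() const { return KillIndices.size(); }

private:
  std::vector<unsigned> KillIndices;
};
```

With a fixed array of 16384 unsigned entries, each fill would touch 64 KiB (assuming a 4-byte unsigned) regardless of the target, whereas a table sized per target only touches as many entries as the target defines.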

/jakob

Yep. David Goodwin is going to fix it.

Evan

I have the fix ready, and I'll check it in tomorrow morning.

David

Sending lib/CodeGen/AggressiveAntiDepBreaker.cpp
Sending lib/CodeGen/AggressiveAntiDepBreaker.h
Sending lib/CodeGen/CriticalAntiDepBreaker.cpp
Sending lib/CodeGen/PostRASchedulerList.cpp
Transmitting file data ....
Committed revision 90970.

Hello,

This patch was reverted, and after the performance regression it
introduced was fixed, nobody remembered to reapply it.

Can someone please reapply it (i.e. increase the max physreg count
to 16K or, even better, to 32K) so that we can experiment with
machines with large register counts again? :)

It was this trivial patch:

Index: include/llvm/Target/TargetRegisterInfo.h

Hello,

I haven't yet received any input on what this patch breaks (some
LLVM-external project?), and the branch creation date is tomorrow.

Can someone please tell us what we can do to get this patch merged
for 2.8?

Thanks!

Hi Jääskeläinen,

I am very interested in how you write the RegisterInfo.td for a
machine with so many registers. Or is it generated by some other
program?

best regards
ether

Hi,

I'll get it in and tested as soon as I can. It broke jitting last time, and
the builders have been in an awful state of flux every time I've wanted
to put it in lately so I can retest.

-eric