Lowering switch statements with hashing

Hello Joerg!

>> I had also considered CHM. This is a method which can easily deal
>> with really huge label sets since its runtime is O(n), and produces
>> a quite small auxiliary table.
>> However CHM produces a hash function that needs 2 table accesses
>> whereas Jenkins' needs only one.
>> CHM: h(x) = (table[h1(x)]+table[h2(x)]) mod size
>> Jenkins: h(x) = h1(x) ^ table[h2(x)]
>> This means that code generated with Jenkins' method ought to be a
>> bit faster.

> For out-of-order CPUs is unlikely to make a difference.

The difference will probably be noticeable; a benchmark will prove it.
At least CHM needs more code. I am not sure about the needed table space.

>> I admit that Jenkins' method will probably take much longer to
>> evaluate the auxiliary table but since we have to deal with only
>> moderately large label sets (usually well below 1000 labels) this
>> should be acceptable. Our goal is not to speed up the compiler at
>> all means but to let it produce better code.

> The main problem is that the missing degree of freedom makes the
> algorithm not scale. Consider for a moment the problem of not finding
> a construction in reasonable time. Now imagine this happens because
> someone just aded one more label to the switch. The huge problem of
> gperf, Jenkin's and similar implementation is that they don't scale
> well beyond a few dozen items -- it might work or it might not.
> That's a serious limitation for a tool like a compiler,
> especially when dealing with a changing code base.

I tweaked Jenkins' algorithm a bit and got it much faster than the original. My experience is that it can now easily deal with several thousand labels in reasonable time (please try it).
If a solution is not found in a reasonable time the compiler will do what it had done before my patch was present: Split the label set in two halves and repeat the algorithm with these halves. Therefore we should not expect that the resulting code will become slower just because of the patch.
We should include a means to create statistics for switch statements in order to obtain a discussion base; a failure of achieving perfect by Jenkins' code could even be output as a hint or warning. If we are then faced with such warnings we can easily enhance the patch by including alternative algorithms such as CHM. We can also do it just to be sure and/or for the sake of being able to benchmark real life programs with different hashing schemes.

As for the size of label sets, in "A Superoptimizer Analysis of Multiway Branch Code Generation" by Roger Anthony Sayle (http://ols.fedoraproject.org/GCC/Reprints-2008/sayle-reprint.pdf) we can find the following:
"The largest difficult switch statement, found in the conversions.c source file of Debian’s msort-8.42 package, had 1,080 case ranges, covering 1,202 case values branching to 320 unique destination labels."
I tried exactly these switches and Jenkins' code could easily deal with them. For this large switch the following code was produced (labels renamed):
  movl %edi, %esi
  shll $15, %esi
  shrl $21, %esi
  movl %edi, %edx
  andl $127, %edx
  movzwl BTable(%rdx,%rdx), %edx
  xorl %esi, %edx
  cmpl ValTable(,%rdx,4), %edi
  jne DefaultCase
  jmpq JumpTable(,%rdx,8)

Best regards
Jasper