Quirk in switch lowering

Hello all,

in SelectionDAGBuilder::handleBTSplitSwitchCase (lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp) seems to be a quirk which can effectively create an else-if chain from a very sparse switch statement instead of a balanced binary tree.

A possible solution is to weight an isolated single switch label with 0. To do this one can replace the assignments to LDensity und RDensity by:
     volatile double LDensity =
        (double)(LSize-1).roundToDouble() /
                            (LEnd - First + 1ULL).roundToDouble();
     volatile double RDensity =
       (double)(RSize-1).roundToDouble() /
                            (Last - RBegin + 1ULL).roundToDouble();

I replaced ?Size by (?Size-1).

Here is a more or less stupid test program (translate with "clang -O3 -S test.cpp" and then look at test.s):

int main(int argc, char **argv) {
   switch (argc) {
     case 100: return 1;
     case 200: return 2;
     case 300: return 3;
     case 400: return 4;
     case 500: return 5;
     case 600: return 6;
     case 700: return 7;
     case 800: return 8;
     case 900: return 9;
     default: return 0;
     }
   }

Best regards
Jasper Neumann

Would you mind reporting a bug in llvm.org/bugs?

Thanks

Hello Rafael, hello all!

> Would you mind reporting a bug in llvm.org/bugs?

I have already done so, see the following entries:
http://llvm.org/bugs/show_bug.cgi?id=18347
http://llvm.org/bugs/show_bug.cgi?id=18348
http://llvm.org/bugs/show_bug.cgi?id=18349

Best regards
Jasper Neumann

Some of the behavior seen there was for a purpose, like the heuristics
to overcome different problems. I have to remember all the details :slight_smile:

Hello Anton!

> Some of the behavior seen there was for a purpose,
> like the heuristics to overcome different problems.
> I have to remember all the details :slight_smile:

I have read your article "Improving Switch Lowering for The LLVM Compiler System" and thought that the selection mechanism works out well; by chance however I found that this mechanism does not work as designed if labels are sufficiently sparse and can effectively create an else-if chain because separated (single) marginal (i.e. first and last) labels have a too high weight.
The central point might be: What is the density of a single label?
It might also be that the logarithmic metric is not always working well.

My simple modification however seems to be a solution to this special problem; until now I have not found any malicious secondary effects (collateral damages).

By the way: I have submitted the patch to llvm-commits@cs.uiuc.edu few minutes ago; perhaps you are the person with the best knowledge for this problem.

I'm currently working on switch lowering using hashing which can fully replace the jump table method. It works quite good by now but I will have to prepare a ton of documentation and test stuff. Some of the remaining problems can easily be solved later.
Is anybody interested in beta-testing my patches therefor?

Best regards
Jasper Neumann

I'm currently working on switch lowering using hashing which can fully
replace the jump table method. It works quite good by now but I will have to
prepare a ton of documentation and test stuff. Some of the remaining
problems can easily be solved later.

Is it somehow similar to MRST (multi-way radix search trees) technique?

I’d be interested in giving it a shot if you’d like some performance testing done.

-eric

Hello Anton,

I'm currently working on switch lowering using hashing which can fully
replace the jump table method. It works quite good by now but I will have to
prepare a ton of documentation and test stuff. Some of the remaining
problems can easily be solved later.

Is it somehow similar to MRST (multi-way radix search trees) technique?

Well, yes and no. The idea is to catch all switch labels with a single perfect hash function which transforms the label values into a small range of numbers. One test follows and thereafter we can dispatch with a jump table. This is especially effective if the switch labels are sparse. The normal jump table approach is a special case thereof.
The MRST technique can be seen as an imperfect hashing approach.
Larger ranges ought to be treated later by MRST or a decision tree; unfortunately I could not yet get this running.
A detailed discussion can be found in http://ols.fedoraproject.org/GCC/Reprints-2008/sayle-reprint.pdf ("A Superoptimizer Analysis of Multiway Branch Code Generation" by Roger Anthony Sayle, 2008).

Best regards
Jasper Neumann