Statistics for 'switch' instruction.

Hi all!

I collected some statistics for SwitchInst, based on our test-suite.

I found that efficiency is meaningful for parser-like applications only, about 15-20% of cases may be reduced (see top 10 items in Raw sheet of attached .xls file).

Below some numbers.

Total number of cases in all test-suite -- 11835.
Clusters possible to make -- 503
Clusters with size (H - L) + 1 >= 8 -- 48
Cases stay single numbers -- 9794

% of all cases may be clustered -- 17.2%
% of all case comparisons may be reduced -- 4.6%
For example if 3 cases turned into cluster with size 3, it means that we reduce 1 comparison. If only 2 cases turned into cluster with size 2, we reduce nothing, since we still need two comparisons: L <= Value and H >= Value.

Average number of switches in application is 1.07.
Average number of cases is 6.4.

How I calculated comparisons-may-be-reduced number.

In spreadsheet file GE3 means cluster with size (H-L+1) greater or equal 3.
Based on number of clusters GE3, GE4, GE8, I calculated number of clusters with
   -- size >= 4 and < 8: CLUSTER4_7 = GE4-GE8.
   -- size == 3: CLUSTER3 = GE3-GE4
   -- size == 2: CLUSTER2 = AllClusters-GE3
Based on assumption that CLUSTER3 allows to reduce 1 comparison, CLUSTER4 allows to reduce 2 comparisons and so on, I calculate lower bound of number of comparisons possible to reduce:
CLUSTER_GE8 means number of clusters with size H-L+1 >= 8

Spreadsheet (.xls file in .tgz) with statistics data and patch with harness that collects data are in attachment.

How to generate statistics.
1. Apply patch in attachment to clang sources.
2. Go to CompilerInstance::ExecuteAction, setup SwitchStatisticsFileName value.
3. Compile clang.
4. Run test-suite.
5. After test-suite finished the statistics file is ready to use.

P.S.: Of course you may run *any* program with patched clang and look at statistics then.


switch-statistics.patch (17.1 KB)

Switch-Statistics.tgz (73.6 KB)

Hi Stepan,

These are quite interesting numbers. Thanks for running this experiment!
Do you have some numbers on the difference of memory consumption and compile time (at -O0 and -O2) with and without case ranges? I think we need to see those numbers before we can decide wether it's interesting to commit to case ranges or not.


Quoting Stepan Dyatkovskiy <>:


You cannot call removeModule on a JIT with no modules: jitstate will be 0 and therefore we have a null-pointer exception.
The function returns a boolean for success/failure, however, so you would expect to be able to call it and get false back.

Should we be checking for jitstate != 0 before accessing the variable?

- if (jitstate->getModule() == M) {
+ if (jitstate && jitstate->getModule() == M) {


Sure, seems reasonable.