Hi Raul and Duncan,

Duncan,

Thank you for review. I hope to present fixed patch tomorrow.

First, I would like to show few performance results:

command: "time opt -mergefunc <test>"

File: tramp3d-v4.ll, 12963 functions

Current : Patched

Time: 9.8 s : 2.0 secs ; >400%!!

Functions merged: 3503 : 3507

File: spirit.ll, 2885 functions

Current : Patched

Time: 2.2 s : 0.5 s

Functions merged: 1503 : 1505

File: k.ll, 2788 functions

Current : Patched

Time: 1.5 s : 0.7 s

Functions merged: 1626 : 1626

File: sqlite3.ll, 1057 functions

Current : Patched

Time: 0.4 s : 0.4 s

Functions merged: 7 : 7

All the tests were token from test-suite object directory. In average it shows >200% performance improvement. You could also see that patched version detects a bit more functions for merging, since it has improved constant-equivalence detection.

Raul,

Yes. you right it is O(N*log(N)). And yes, I have some ideas about hashing.

Though in the end of hashing improvement I still see the tree. I'll explain.

Once you calculated hash (even really good one), you still have probability of N>1 functions with the same hash. Of course it could be really low. But then, you have to spend some time just being creating such a complex hash (definitely, you have to scan whole function body).

I think there should be sequence of short hash numbers. In another words we can try to represent function as a chain of hash codes:

F0: hashF0-0, hashF0-1, hashF0-2, ...

F1: hashF1-0, hashF1-1, ...

In this case you can create kind of hash-tree. Consider we have F0, F1 and F2.

Imagine, hashF0-0 == hashF1-0 == hashF2-0, hashF0-1 == hashF1-1:

Then we can build the next tree (requires fixed-width fonts):

[hashF0-0]

/ \

[hashF0-1] [hashF2-1]

/ \ \

[hashF0-2] [hashF1-2] [hashF2-2]

In this case as you can see, whole process would be of complexity O(N*n), where:

N - number of functions

n - function size.

Note, hashF[i]-[j] could be generated on demand. It is not obvious to generate whole hash at once.

I have also implemented this approach. Though it is looks more complex, and not so fast, as I hoped. Perhaps I can make it simplier. But it could not be simpler that just to replace "bool" with -1,0,1. Actually I really wanted to propose idea with hash-trees, since it combines advantages of hashing and tree-based lookup routines. But then I discovered idea I presented finally, and saw that it shows almost same improvement and looks much simpler.

-Stepan

Duncan P. N. Exon Smith wrote: