Hi Raul and Duncan,
Duncan,
Thank you for review. I hope to present fixed patch tomorrow.
First, I would like to show few performance results:
command: "time opt -mergefunc <test>"
File: tramp3d-v4.ll, 12963 functions
Current : Patched
Time: 9.8 s : 2.0 secs ; >400%!!
Functions merged: 3503 : 3507
File: spirit.ll, 2885 functions
Current : Patched
Time: 2.2 s : 0.5 s
Functions merged: 1503 : 1505
File: k.ll, 2788 functions
Current : Patched
Time: 1.5 s : 0.7 s
Functions merged: 1626 : 1626
File: sqlite3.ll, 1057 functions
Current : Patched
Time: 0.4 s : 0.4 s
Functions merged: 7 : 7
All the tests were token from test-suite object directory. In average it shows >200% performance improvement. You could also see that patched version detects a bit more functions for merging, since it has improved constant-equivalence detection.
Raul,
Yes. you right it is O(N*log(N)). And yes, I have some ideas about hashing.
Though in the end of hashing improvement I still see the tree. I'll explain.
Once you calculated hash (even really good one), you still have probability of N>1 functions with the same hash. Of course it could be really low. But then, you have to spend some time just being creating such a complex hash (definitely, you have to scan whole function body).
I think there should be sequence of short hash numbers. In another words we can try to represent function as a chain of hash codes:
F0: hashF0-0, hashF0-1, hashF0-2, ...
F1: hashF1-0, hashF1-1, ...
In this case you can create kind of hash-tree. Consider we have F0, F1 and F2.
Imagine, hashF0-0 == hashF1-0 == hashF2-0, hashF0-1 == hashF1-1:
Then we can build the next tree (requires fixed-width fonts):
[hashF0-0]
/ \
[hashF0-1] [hashF2-1]
/ \ \
[hashF0-2] [hashF1-2] [hashF2-2]
In this case as you can see, whole process would be of complexity O(N*n), where:
N - number of functions
n - function size.
Note, hashF[i]-[j] could be generated on demand. It is not obvious to generate whole hash at once.
I have also implemented this approach. Though it is looks more complex, and not so fast, as I hoped. Perhaps I can make it simplier. But it could not be simpler that just to replace "bool" with -1,0,1. Actually I really wanted to propose idea with hash-trees, since it combines advantages of hashing and tree-based lookup routines. But then I discovered idea I presented finally, and saw that it shows almost same improvement and looks much simpler.
-Stepan
Duncan P. N. Exon Smith wrote: