MergeFunctions: reduce complexity to O(log(N))

Hi all,

I propose a simple improvement for the MergeFunctions pass that reduces its complexity from O(N^2) to O(log(N)), where N is the number of functions in the module.

The idea is to change the result of comparison from "bool" to "-1, 0, 1"; in other words, to define an order relation on the set of functions.
To be sure that functions can be compared that way, we have to prove that an order relation is possible at every comparison stage.

The latter is possible if the comparison we implement at each stage has the following properties:
* reflexivity (a <= a, a == a, a >= a),
* antisymmetry (if a <= b and b <= a then a == b),
* transitivity (if a <= b and b <= c, then a <= c),
* asymmetry (if a < b, then a > b or a == b).

Once we have defined an order relation, we can store all the functions in a binary tree and perform lookups in O(log(N)) time.
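As a minimal illustration of that idea (a sketch only, with stand-in types rather than the actual pass code), a 3-way comparison gives us a strict weak ordering we can hand to std::set:

    #include <cstddef>
    #include <set>
    #include <vector>

    // Stand-in for a function; in the real pass a FunctionComparator would
    // walk two llvm::Function bodies instead of comparing these vectors.
    using FuncKey = std::vector<int>;

    // Total 3-way comparison: returns -1, 0 or 1.
    static int cmpFuncs(const FuncKey &A, const FuncKey &B) {
      if (A.size() != B.size())
        return A.size() < B.size() ? -1 : 1;
      for (std::size_t I = 0; I != A.size(); ++I)
        if (A[I] != B[I])
          return A[I] < B[I] ? -1 : 1;
      return 0;
    }

    // Adapt the 3-way result into the strict weak ordering std::set needs.
    struct FuncLess {
      bool operator()(const FuncKey *L, const FuncKey *R) const {
        return cmpFuncs(*L, *R) < 0;
      }
    };

    // Each insert/lookup costs O(log(N)) comparisons instead of comparing
    // against every previously seen function.
    std::set<const FuncKey *, FuncLess> FnTree;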

This post has two attachments:
1. The patch that implements this idea.
2. A detailed description of the MergeFunctions pass, with an explanation of how the order relation is possible.

Hope it helps to make things better!

-Stepan.

MergeFunctions.doc.tar.gz (20 KB)

mergefunctions-less-comparison-2014-01-17.patch.tar.gz (10 KB)

ping
Stepan Dyatkovskiy wrote:

Hi, Stepan…

Each insertion in the tree will be O(log(N)), which would make the whole process O(N log(N)), right?
It is still an improvement over an O(N^2) all-to-all check, of course…

Have you explored alternatives such as improving the hashing or doing some secondary hashing? That might have an even bigger impact, potentially making it O(N) in cases with perfect hashing.

Hi Stepan,

This looks interesting! Some high-level comments:

- Please post the patch untarred next time. Also, I'm not sure if it's typical
  to provide supporting documents in .doc format; while many of us probably
  have access to Word, a more portable and email-friendly format (like text,
  markdown or rst) is probably better.
- Have you profiled this? What's the speedup? I second Raul's question about
  whether improving the hashing (or doing secondary hashing) could result in a
  larger benefit (maybe even with simpler code).
- At the beginning of FunctionComparator::cmpEnumerate, you have a comment that
  you can't introduce an order relation for cross-references, and the following
  code:

        // Unfortunately we can't introduce order relation for
        // cross reference case. It is possible for self-reference only.
        if (V1 == F1) {
          if (V2 == F2)
            return 0;
          return -1;
        }
        if (V2 == F2) {
          if (V1 == F1)
            return 0;
          return 1;
        }

    Can't this break asymmetry? Doesn't breaking the ordering relation cause a
    correctness problem? I.e., what am I missing here?

I also have a few low-level comments (and nitpicks):

- The comments at the top of the file don't match the new implementation!
- Lots of strange tabbing after your changes. E.g.:

        + int cmpGEP(const GetElementPtrInst *GEP1,
                                const GetElementPtrInst *GEP2) {

- After the FunctionPtr class declaration, you have an extra blank line. There
  are a few of these scattered through (e.g., at the beginning of
  FunctionComparator::cmpConstants).
- Your new helper functions (such as cmpNumbers) that are local to the file
  should be declared 'static'.
- You use the following pattern throughout:

        int Res;
        // ...
        Res = cmpNumbers(I1->getNumOperands(), I2->getNumOperands());
        if (Res != 0)
          return Res;
      
    whereas, since you never use Res unless it's non-zero, I think the
    following is more clear and concise:

        if (int Res = cmpNumbers(I1->getNumOperands(), I2->getNumOperands()))
          return Res;

- I'm not sure if your MERGEFUNC_SANITY should be included the way it is:

        +//#define MERGEFUNC_SANITY 900
        +#ifdef MERGEFUNC_SANITY

- I’m also unsure about the following:

        + // If we have some problematic functions, perhaps we would like
        + // to keep declarations only, and those definitions fires the problem.
        + // We could insert manualy last ones (notepad, nano, vi, whatever else :) )
        +#if 0
        + for (Module::iterator I = M.begin(), E = M.end(); I != E; ++I) {
        + Function *F = I;
        + F->deleteBody();
        + }
        +#endif

Cheers,
Duncan

ping
Stepan Dyatkovskiy wrote:

Hi all,

I propose a simple improvement for the MergeFunctions pass that reduces its
complexity from O(N^2) to O(log(N)), where N is the number of functions in
the module.

The idea is to change the result of comparison from "bool" to
"-1, 0, 1"; in other words, to define an order relation on the set of functions.
To be sure that functions can be compared that way, we have to
prove that an order relation is possible at every comparison stage.

The latter is possible if the comparison we implement at each stage has
the following properties:
* reflexivity (a <= a, a == a, a >= a),
* antisymmetry (if a <= b and b <= a then a == b),
* transitivity (if a <= b and b <= c, then a <= c),
* asymmetry (if a < b, then a > b or a == b).

For asymmetry, I think it’s: if a < b, then not a > b and not a == b.

Hi Raul and Duncan,

Duncan,
Thank you for the review. I hope to present a fixed patch tomorrow.

First, I would like to show a few performance results:

command: "time opt -mergefunc <test>"

File: tramp3d-v4.ll, 12963 functions
                   Current : Patched
Time:              9.8 s   : 2.0 s    (>400%!!)
Functions merged:  3503    : 3507

File: spirit.ll, 2885 functions
                   Current : Patched
Time:              2.2 s   : 0.5 s
Functions merged:  1503    : 1505

File: k.ll, 2788 functions
                   Current : Patched
Time:              1.5 s   : 0.7 s
Functions merged:  1626    : 1626

File: sqlite3.ll, 1057 functions
                   Current : Patched
Time:              0.4 s   : 0.4 s
Functions merged:  7       : 7

All the tests were taken from the test-suite object directory. On average it shows a >200% performance improvement. You can also see that the patched version detects a few more functions for merging, since it has improved constant-equivalence detection.

Raul,

Yes, you're right, it is O(N*log(N)). And yes, I have some ideas about hashing.
Though even at the end of a hashing improvement I still see the tree. I'll explain.

Once you have calculated the hash (even a really good one), you still have some probability of N>1 functions with the same hash. Of course it could be really low. But then, you have to spend some time just creating such a complex hash (you definitely have to scan the whole function body).

I think there should be a sequence of short hash numbers. In other words, we can try to represent a function as a chain of hash codes:
F0: hashF0-0, hashF0-1, hashF0-2, ...
F1: hashF1-0, hashF1-1, ...

In this case you can create a kind of hash tree. Consider we have F0, F1 and F2.

Imagine hashF0-0 == hashF1-0 == hashF2-0, and hashF0-1 == hashF1-1:

Then we can build the following tree (requires a fixed-width font):

                  [hashF0-0]
                  /        \
         [hashF0-1]        [hashF2-1]
         /        \                 \
[hashF0-2]  [hashF1-2]         [hashF2-2]

In this case, as you can see, the whole process would be of complexity O(N*n), where:

N - number of functions
n - function size.

Note that hashF[i]-[j] could be generated on demand; it is not necessary to generate the whole hash at once.
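Roughly, a sketch of such a hash tree could look like this (illustrative types only, not the implementation I tried):

    #include <cstdint>
    #include <map>
    #include <memory>
    #include <vector>

    // Each function is reduced to a chain of small hash codes, one per
    // "chunk" (e.g. one per basic block), and the chains are stored in a
    // trie. Only functions that end up in the same leaf share every chunk
    // hash and still need a full comparison.
    struct HashTreeNode {
      std::map<std::uint64_t, std::unique_ptr<HashTreeNode>> Children;
      std::vector<int> FunctionIds; // functions whose whole chain ends here
    };

    // The chain could be produced lazily ("on demand") while walking the
    // function body, stopping as soon as the walk leaves the shared path.
    void insertChain(HashTreeNode &Root, const std::vector<std::uint64_t> &Chain,
                     int FnId) {
      HashTreeNode *Node = &Root;
      for (std::uint64_t H : Chain) {
        std::unique_ptr<HashTreeNode> &Child = Node->Children[H];
        if (!Child)
          Child = std::make_unique<HashTreeNode>();
        Node = Child.get();
      }
      Node->FunctionIds.push_back(FnId);
    }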

I have also implemented this approach, though it looks more complex and is not as fast as I hoped. Perhaps I can make it simpler, but it could not be simpler than just replacing "bool" with -1,0,1. Actually I really wanted to propose the idea with hash trees, since it combines the advantages of hashing and tree-based lookup routines. But then I discovered the idea I finally presented, and saw that it shows almost the same improvement and looks much simpler.

-Stepan

Duncan P. N. Exon Smith wrote:

Hi Raul and Duncan,

Duncan,
Thank you for the review. I hope to present a fixed patch tomorrow.

First, I would like to show a few performance results:

command: "time opt -mergefunc <test>"

File: tramp3d-v4.ll, 12963 functions
                   Current : Patched
Time:              9.8 s   : 2.0 s    (>400%!!)
Functions merged:  3503    : 3507

File: spirit.ll, 2885 functions
                   Current : Patched
Time:              2.2 s   : 0.5 s
Functions merged:  1503    : 1505

File: k.ll, 2788 functions
                   Current : Patched
Time:              1.5 s   : 0.7 s
Functions merged:  1626    : 1626

File: sqlite3.ll, 1057 functions
                   Current : Patched
Time:              0.4 s   : 0.4 s
Functions merged:  7       : 7

All the tests were taken from the test-suite object directory. On average it shows a >200% performance improvement.

These results look promising! When you say 200% average, do you mean
200% average for all files in test-suite, or just these four? Are
there any negative results?

You can also see that the patched version detects a few more functions for merging, since it has improved constant-equivalence detection.

1. I’d prefer to see the improvement to constant-equivalence detection
   as a separate initial patch (to the old code). Is that possible?
   In particular, it would be reassuring to see your patch *not* change
   the number of functions merged.

2. I’m also still concerned about correctness. Previously, when the
   comparison result was unknown, returning “false” would conservatively
   consider the functions distinct, so we had no correctness issues.

   However, with your patch, *everything* must be accurately compared,
   right? Are there any cases where you can’t accurately order
   functions? (Is FunctionComparator::cmpEnumerate an example?)

   If these are problems, maybe you can use the previous ==-style
   comparison to confirm.

Raul,

Yes, you're right, it is O(N*log(N)). And yes, I have some ideas about hashing.
Though even at the end of a hashing improvement I still see the tree. I'll explain.

Once you have calculated the hash (even a really good one), you still have some probability of N>1 functions with the same hash. Of course it could be really low. But then, you have to spend some time just creating such a complex hash (you definitely have to scan the whole function body).

I think there should be a sequence of short hash numbers. In other words, we can try to represent a function as a chain of hash codes:
F0: hashF0-0, hashF0-1, hashF0-2, ...
F1: hashF1-0, hashF1-1, ...

In this case you can create a kind of hash tree. Consider we have F0, F1 and F2.

Imagine hashF0-0 == hashF1-0 == hashF2-0, and hashF0-1 == hashF1-1:

Then we can build the following tree (requires a fixed-width font):

                  [hashF0-0]
                  /        \
         [hashF0-1]        [hashF2-1]
         /        \                 \
[hashF0-2]  [hashF1-2]         [hashF2-2]

In this case, as you can see, the whole process would be of complexity O(N*n), where:

N - number of functions
n - function size.

Note that hashF[i]-[j] could be generated on demand; it is not necessary to generate the whole hash at once.

I have also implemented this approach, though it looks more complex and is not as fast as I hoped. Perhaps I can make it simpler, but it could not be simpler than just replacing "bool" with -1,0,1. Actually I really wanted to propose the idea with hash trees, since it combines the advantages of hashing and tree-based lookup routines. But then I discovered the idea I finally presented, and saw that it shows almost the same improvement and looks much simpler.

-Stepan

I think you get diminishing returns for each level of the tree. I’d be
interested in seeing numbers on a simpler 2-stage hash.

- Leave the first hash as it is now in the code.
- Add a secondary hash that’s built from the quantities that the equals
  comparison (or your new less-than comparison) is based on.
- Fall back on O(n^2) for the third stage when there are collisions.
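
In other words, something along these lines (a tiny self-contained sketch; Fn, cheapHash and deepHash are stand-ins, not the pass's actual code):

    #include <cstddef>
    #include <cstdint>
    #include <unordered_map>
    #include <utility>
    #include <vector>

    struct Fn { std::vector<int> Body; }; // stand-in for llvm::Function

    // Stage 1: keep the existing cheap hash as it is.
    static std::uint64_t cheapHash(const Fn &F) { return F.Body.size(); }

    // Stage 2: a deeper hash over the quantities the equality check uses
    // (here just an FNV-1a style hash of the whole body).
    static std::uint64_t deepHash(const Fn &F) {
      std::uint64_t H = 14695981039346656037ULL;
      for (int V : F.Body) {
        H ^= static_cast<std::uint64_t>(V);
        H *= 1099511628211ULL;
      }
      return H;
    }

    // Stage 3: full comparison, only within double-collision groups.
    static bool definitelyEqual(const Fn &A, const Fn &B) {
      return A.Body == B.Body;
    }

    std::vector<std::pair<Fn *, Fn *>> findIdentical(std::vector<Fn *> &Fns) {
      std::vector<std::pair<Fn *, Fn *>> Pairs;
      std::unordered_map<std::uint64_t, std::vector<Fn *>> Stage1;
      for (Fn *F : Fns)
        Stage1[cheapHash(*F)].push_back(F);
      for (auto &B1 : Stage1) {
        std::unordered_map<std::uint64_t, std::vector<Fn *>> Stage2;
        for (Fn *F : B1.second)
          Stage2[deepHash(*F)].push_back(F);
        // Pairwise work is confined to functions colliding on *both* hashes.
        for (auto &B2 : Stage2)
          for (std::size_t I = 0; I < B2.second.size(); ++I)
            for (std::size_t J = I + 1; J < B2.second.size(); ++J)
              if (definitelyEqual(*B2.second[I], *B2.second[J]))
                Pairs.emplace_back(B2.second[I], B2.second[J]);
      }
      return Pairs;
    }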

Hi Stepan,

As you've seen we have recently implemented a significant enhancement to the MergeFunctions pass that also allows merging of functions that are only similar but not identical
(http://llvm-reviews.chandlerc.com/D2591).

Our patch also changes the way that functions are compared quite significantly. This is necessary because we need to compare all functions in a bucket in order to get a similarity measure for each pair, so we can then decide which 'groups' of functions to merge.

I can't really see a way to combine our approach with your patch. What are your thoughts?

Another way to improve the performance of MergeFunctions might be to make the hash function better. I've already added the size of the first BB to it, and perhaps there are other factors we could try... if we don't have to compare functions in the first place (because they're in different buckets) then that's obviously a much bigger win.

Thanks,
Tobias

Hi Tobias,

I can't really see a way to combine our approach with your patch. What
are your thoughts?

I think it is possible. Even more: we need to combine our efforts in order to bring this pass into real life.
I have read your presentation file, but unfortunately have only read your patch a little.
How exactly do you scan functions in the 2nd stage? Could you explain the algorithm in a few words: how does your comparison workflow go? Is it possible to scan a binary tree instead of a hash table? ...OK.

That's how I see the modification. For now it's important to explain the idea, so consider the simplest case: you need to catch two functions that differ by a single instruction somewhere.

1. Imagine that the IR *module* contents are represented as a binary tree:
Each path (trace) from the root to a terminal node is a function.
Each node is a function's primitive (an instruction opcode, for example).
Down is the direction towards the terminal nodes; up is the direction towards the root.

2. Now you are standing on some node, and you have two directions: down-left and down-right.

3. If you are able to find two equal sub-traces going down (at the left and at the right), then the only difference lies in this node, and we catch that case.

4. Even more, if the two subtrees at the left and at the right are equal, then you catch at once all the cases that are represented by these subtrees.

I still haven't looked at your patch carefully, sorry. But I hope that helps; I'll try to look at it soon, and perhaps what I gave in this post is not the best solution.

-Stepan

Hi Duncan,

These results look promising! When you say 200% average, do you mean
200% average for all files in test-suite, or just these four? Are
there any negative results?

OK, now it's clear we need really full statistics. I have some, but for an old revision.
In general, this patch only updates the result type, and it really doesn't do any extra calculations (except for the constants, but below I'll explain why that is necessary).

A few words about accuracy and correctness.
When the old algorithm says "these two parts differ", the new one says only a bit more: which part is less. Two things follow from that approach:

1. Each time we determine a difference in the old algorithm, we also determine a difference in the new one. If the less-than comparison is implemented properly, that also means the result will be 100% the same.
So the correctness is provided by the correctness of the "less" comparison: all order-relation properties should be implemented and proven.

2. The changes are so small (only a few "if" statements for each comparison method) that it's almost impossible for the new version to work slower than the old one. Even in the worst case, when the difference lies at the very end of the functions, we spend almost the same CPU time. Just as an example:
        - if (Ty1->getTypeID() != Ty2->getTypeID())
        -   return false;
        + if (int Res = cmpNumbers(Ty1->getTypeID(), Ty2->getTypeID()))
        +   return Res;

So, when there is nothing to merge, we get the same performance results as with the old approach.
When there are some equivalent functions that could be merged, the new approach gives a performance improvement.

Now about constants.
I'll prepare a separate patch. In comparison with the old approach, it compares constant contents; it is a replacement for the bitcast-possibility check. And yes, it's the slowest part of the new approach. Here we need to understand which constant is actually greater and which one is less. Though it could be implemented like this:

        if (C1 != ConstantExpr::getBitCast(const_cast<Constant*>(C2), C1->getType()))
          return cmpConstantContents(C1, C2);
        // otherwise, go on to the next checks

The only thing we need to be sure of here is whether getBitCast is transitive.

About the hash: that's reasonable, I'll try it.

Full statistics are coming soon...

-Stepan

Thank you Stepan for your comments.

While I agree that a better hash would be more expensive to compute, it won't be a factor as long as it is linear in the size of the function, even if you have to walk the whole routine. After all, the comparison may need to walk over the whole routine as well (and do so n log(n) times, instead of just n times).

Where the comparison approach has an edge is that it can bail out at the first difference, while the hash needs to be computed in full every time. This may be a noticeable difference in practice. That is why a multiple-hash approach may be interesting to explore: use a very fast hash first, and then compute a more expensive hash at the second level as we go down into each bucket. Even multiple levels, where we slowly look further down into the function, may be appropriate, e.g. look at the first block for the first hash, then the next three blocks for the second hash, and so on.

Also, some heuristics related to code size may be relevant here: as the size of the function increases, the chances of finding a match likely go down very quickly.

Hi Duncan,

Statistics: is it OK if I send them in CSV format?

Below, a few improvement numbers are presented. The following formula was used:
Improvement = (OldTime/NewTime - 1) * 100%

1. Overall (even with cases when there is nothing to merge): 21%

2. When >100 functions were merged: 95%
(this is what I meant in the previous post: when there is work for -mergefunc, the new patch does it 2 times faster)

3. >500 functions (though there are only 3 modules): 127%.

4. Total number of functions: ~60000
(currently this number is absent from the statistics; it is just based on previous analysis)

5. Functions merged with the patch: 8342
6. Functions merged without the patch: 8345

7. So, taking into account #5 and #6, we can conclude that MergeFunctions catches about 10-14% of cases (though I need to recheck this number).

Since it is not possible to establish an order relation on cross-referenced functions, we can see here that the new patch can't merge 3 of the functions, though that is only 0.03%.

All .ll files were taken from the test-suite object dir.

I'm working on more detailed statistics. Once the statistics script is written, it will be easy to bring more details into this research.

P.S.: New patches are coming also...

-Stepan.

Stepan Dyatkovskiy wrote:

real-set-report.csv (47 KB)

Hi Tobias.

So, what do you think?

If it means too much extra work for your team, maybe I can help you somehow? I really would like to.

-Stepan

Stepan Dyatkovskiy wrote:

Hello,
I'm glad to present a patch with fixes and clean-ups:

1. Comments at the top of the file were fixed. I also added some new TODOs there. Fixed the "strange" tabbing.

2. The suggested pattern is now used:
> whereas, since you never use Res unless it's non-zero, I think the
> following is more clear and concise:
>
> if (int Res = cmpNumbers(I1->getNumOperands(), I2->getNumOperands()))
> return Res;

3. MERGEFUNC_SANITY was converted into an additional command-line option, so you can run the pass with the command:
opt -mergefunc -mergefunc-sanity 100 my-module.ll
And it will check the pass for sanity using the first 100 functions in your module.

4. Constants. I implemented cmpConstants in a conservative way, so now it catches only a subset of what the old approach did, and yet that is about 99.9% of all cases.

Please find the patch attached for review.

-Stepan

Stepan Dyatkovskiy wrote:

mergefunctions-less-comparison-2014-01-27.patch (48.9 KB)

Hi Stepan,

Sorry for the delay. It's great that you are working on MergeFunctions as well and I agree, we should definitely try to combine our efforts to improve MergeFunctions.

Just to give you some context, the pass (with the similar function merging patch) is already being used in a production setting. From my point of view, it would be better if we focus on improving its capability of merging functions at this stage rather than on reduction of compilation time.

I'll give a brief overview of how our similar function merging algorithm works (you can also watch the presentation from the US LLVM conference to hear me explain it using an animation).

1. Hash all functions into buckets

In each bucket, separately:

  2. Compare functions pair-wise and determine a
     similarity metric for each pair (%age of equivalent instructions)

  3. Merge identical functions (with similarity = 100%), update call
     sites for those functions.

  4. If the updates of call sites have touched other functions,
     go back to step 2 and re-compare *only* those functions to all
     others in their buckets.

  Finally,

  5. Form groups of similar functions for merging:
     a) Pick most similar pair (A,B)
     b) Find all functions C that are also similar to B but are not
        more similar to some other function D.
     c) Merge A with B and all the C's.
     Repeat until there are no more functions to merge.
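
Roughly, steps 1 and 2 look like this (a much-simplified, self-contained sketch with illustrative types; the actual patch works on llvm::Function and applies size thresholds):

    #include <algorithm>
    #include <cstddef>
    #include <cstdint>
    #include <map>
    #include <unordered_map>
    #include <utility>
    #include <vector>

    struct Fn { std::uint64_t Hash; std::vector<int> Insts; }; // illustrative

    // Step 2's metric: %age of equivalent instructions for a pair.
    static double similarity(const Fn &A, const Fn &B) {
      std::size_t Longer = std::max(A.Insts.size(), B.Insts.size());
      if (Longer == 0)
        return 100.0;
      std::size_t Common = std::min(A.Insts.size(), B.Insts.size()), Same = 0;
      for (std::size_t I = 0; I != Common; ++I)
        Same += (A.Insts[I] == B.Insts[I]);
      return 100.0 * Same / Longer;
    }

    using PairSim = std::map<std::pair<const Fn *, const Fn *>, double>;

    // Step 1: hash into buckets; step 2: score each pair within a bucket.
    // Steps 3-5 (merging, call-site updates, group formation) then work
    // off these scores.
    static PairSim computeSimilarities(const std::vector<Fn> &Fns) {
      std::unordered_map<std::uint64_t, std::vector<const Fn *>> Buckets;
      for (const Fn &F : Fns)
        Buckets[F.Hash].push_back(&F);
      PairSim Sim;
      for (auto &B : Buckets)
        for (std::size_t I = 0; I < B.second.size(); ++I)
          for (std::size_t J = I + 1; J < B.second.size(); ++J)
            Sim[{B.second[I], B.second[J]}] =
                similarity(*B.second[I], *B.second[J]);
      return Sim;
    }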

As you can see, we need to compute a similarity measure for each pair of functions in a bucket. If I understand correctly, your patch reduces compile-time by avoiding comparisons of functions that cannot be identical. That's exactly what you want to do if you only want to merge identical functions. However, because we also need to determine whether functions are merely *similar*, I can't see how your idea could be applied in that case.

Looking at your statistics, I see that you are considering a very large number of functions. I don't know which benchmarks you are using, but I doubt that many of these are worth merging. For instance, you rarely gain a code size benefit from merging functions with just a few instructions. Our patch takes care of this using thresholds. It's worth looking at actual code size reductions, rather than just numbers of functions merged/ compilation time spent comparing functions. It turns out that the number of functions in each bucket is really quite small once you apply some heuristics (and could still be further reduced!).

Given your experience with MergeFunctions, it would be really great if you could review our patch and also try it out on your benchmarks.

Tobias

Hi Tobias,

As you can see, we need to compute a similarity measure for each pair of
functions in a bucket. If I understand correctly, your patch reduces
compile-time by avoiding comparisons of functions that cannot be
identical. That's exactly what you want to do if you only want to merge
identical functions. However, because we also need to determine whether
functions are merely *similar*, I can't see how your idea could be
applied in that case.

Did you read my previous post to you? What do you think about the tree advantages for your approach that were explained there?

Looking at your statistics, I see that you are considering a very large
number of functions. I don't know which benchmarks you are using, but I
doubt that many of these are worth merging. For instance, you rarely
gain a code size benefit from merging functions with just a few
instructions. Our patch takes care of this using thresholds. It's worth
looking at actual code size reductions, rather than just numbers of
functions merged/ compilation time spent comparing functions. It turns
out that the number of functions in each bucket is really quite small
once you apply some heuristics (and could still be further reduced!).

Based on my previous statistics, the average size of a merged function is about 10-20 instructions. Anyway, I will present new statistics with sizes included. I will also publish statistics for your approach.

I talked with Nick (CC'd) about mergefunc at one of the socials, and he had some really neat ideas for speeding this up which might complement or supersede the approach that you are proposing (also the one that Tobias is proposing). For example, he had one suggestion based on callgraph SCCs which was really neat.

Nick, can you maybe give some more details on the ideas you told me and/or provide some feedback about the directions that Stepan (this thread) and Tobias (<http://llvm-reviews.chandlerc.com/D2591>) are proposing?

– Sean Silva

Hi Stepan,

Sorry for the delay. It's great that you are working on MergeFunctions as
well and I agree, we should definitely try to combine our efforts to
improve MergeFunctions.

Just to give you some context, the pass (with the similar function merging
patch) is already being used in a production setting. From my point of
view, it would be better if we focus on improving its capability of merging
functions at this stage rather than on reduction of compilation time.

I'll give a brief overview of how our similar function merging algorithm
works (you can also watch the presentation from the US LLVM conference to
hear me explain it using an animation).

1. Hash all functions into buckets

In each bucket, separately:

2. Compare functions pair-wise and determine a
    similarity metric for each pair (%age of equivalent instructions)

3. Merge identical functions (with similarity = 100%), update call
    sites for those functions.

4. If the updates of call sites have touched other functions,
    go back to step 2 and re-compare *only* those functions to all
    others in their buckets.

Finally,

5. Form groups of similar functions for merging:
    a) Pick most similar pair (A,B)
    b) Find all functions C that are also similar to B but are not
       more similar to some other function D.
    c) Merge A with B and all the C's.
    Repeat until there are no more functions to merge.

As you can see, we need to compute a similarity measure for each pair of
functions in a bucket. If I understand correctly, your patch reduces
compile-time by avoiding comparisons of functions that cannot be identical.
That's exactly what you want to do if you only want to merge identical
functions. However, because we also need to determine whether functions are
merely *similar*, I can't see how your idea could be applied in that case.

Yeah, the existing pass only tries to merge identical functions (identical
in machine code). I'm wondering if the "similar" function merging would be
better as a separate pass (I'm genuinely wondering; I have no clue). I
would wait for Nick's feedback.

-- Sean Silva

Hello Sean and Tobias,

Sean,
Thank you. Could you describe Nick's ideas in a few words or give me links to your discussion, so I can adapt my ideas to them?

Tobias,
Your patch fails on several modules in my benchmark (73 of ~1800 tests). I have attached one of them.

See the statistics files for more details; all the .ll files can simply be found in the test-suite object directory (after "make TEST=nightly report").
I have attached the script that launches the benchmark. Example of usage (you need a Linux OS):
bash test-scipt.sh report.txt opt-no-patch opt-with-patch

Some numbers.

Total number of functions: 52715

Below, by default, I have excluded modules where your patch fails.

Functions merged
Original version: 1113
Order relation patch: 1112
Similar merging patch: 541

Functions merged (including failed tests)
Original version: 8398
Order relation patch: 8395

Summary files size
Initial: 163595634 bytes.
After merging with order relation patch: 163147731 bytes.
After similar merging patch: 162491905 bytes.

Summary files size (including failed tests)
Initial: 250242266 bytes.
Original version: 247168198 bytes.
Order relation: 247175650 bytes.

Time: I measured with the "time" utility and used the "real (wall clock) time used by the process".

Summary time spent
With existing version: 28.05 secs.
With order relation patch: 28.19 secs.
Similar merging patch: 28.61 secs.

Summary time spent (including failed tests)
With existing version: 41.74 secs.
With order relation patch: 36.35 secs.

-Stepan

Sean Silva wrote:

test-scipt.sh (2.04 KB)

report.simmerge.txt (86.8 KB)

report.order.txt (87.4 KB)

sqlite3.ll.tgz (909 KB)

Hello Sean and Tobias,

Sean,
Thank you. Could you describe Nick's ideas in few words or give me links
to your discussion, so I could adapt my ideas to it.

Sorry I haven't had time to write it up. The essential idea is that we use
partition-refinement to determine the differences between a group of
functions instead of using pair-wise comparison.

Imagine you want to compare long sequences of numbers. You can grab a pair
of sequences and run down them until they differ or you reach the end, and
move on to another pair of sequences. With partition refinement you start
with the first number in all the sequences and bucket them. Say, three 1's,
two 2's, and an 8. You've just formed 3 buckets. For each bucket (with more
than 1 member), do it again with the next number in the sequence. This gets
you an equivalence comparison at less than the complexity of pairwise
comparisons. (The buckets are partitions, and you're refining them as you
read more data. Partition refinement is the dual of union-find, which we
use quite heavily in the compiler.)
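
In code, a toy version of this refinement on plain integer sequences (nothing mergefunc-specific) might look like:

    #include <cstddef>
    #include <map>
    #include <utility>
    #include <vector>

    using Seq = std::vector<int>;

    // Refine one bucket of sequences by their Pos-th element and recurse on
    // every sub-bucket. Sequences that stay together until everyone is
    // exhausted are equal; the work is proportional to the data read, not
    // to the number of pairs.
    static void refine(const std::vector<const Seq *> &Bucket, std::size_t Pos,
                       std::vector<std::vector<const Seq *>> &Groups) {
      bool AnyLeft = false;
      for (const Seq *S : Bucket)
        AnyLeft = AnyLeft || Pos < S->size();
      if (Bucket.size() < 2 || !AnyLeft) {
        Groups.push_back(Bucket);
        return;
      }
      // Partition by the element at Pos; sequences that have already ended
      // get their own key so they are never mixed with longer ones.
      std::map<std::pair<bool, int>, std::vector<const Seq *>> ByElt;
      for (const Seq *S : Bucket) {
        bool Ended = Pos >= S->size();
        ByElt[{Ended, Ended ? 0 : (*S)[Pos]}].push_back(S);
      }
      for (auto &Sub : ByElt)
        refine(Sub.second, Pos + 1, Groups);
    }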

I haven't figured out how this stacks up against Stepan's sorting with <=>
comparison. I can probably do that, I just haven't spent the time thinking
about it.

I also haven't thought through how to integrate it with "near miss"
comparison. My current leading idea is that you do the same thing, but make
a record of what makes two buckets similar. (Note that the wrong solution
is to store a per-function-pair list of similarities/differences.)

Nick


Hi all,
Please find the updated patch attached:
* Added some comments.
* Fixed some typos.

-Stepan

Nick Lewycky wrote:

mergefunctions-less-comparison-2014-01-31.patch (48.8 KB)