Two questions about MergeFunctions pass

Hi Nick.

Can you help me sort out a few things in the MergeFunctions pass? While working on it I came up with several questions. I tried hard to find all the answers myself, but two questions remain unanswered.

Both are about the merging of functions itself (not the comparison).

First question is:
Why do we sometimes use RAUW and sometimes replaceDirectCallers? Could you explain how "overridability" and the ability to create aliases affect the choice between RAUW and replaceDirectCallers?

And the second question.
There is a case where both "F" and "G" are overridable and the target supports global aliases. We replace "G" with an alias to "F". That's OK. But why do we also replace "F" with an alias to "F"? I suppose it is to keep callers equal where possible. But aren't callers already equal if caller "A" calls "F" and caller "B" calls "alias-to-F"? Perhaps there are other reasons?

Thank you very much!
-Stepan

> First question is:
> Why do we sometimes use RAUW and sometimes replaceDirectCallers? Could you
> explain how "overridability" and the ability to create aliases affect the
> choice between RAUW and replaceDirectCallers?

If we know that we're going to replace F with a thunk to G, and F is *not*
weak, why should we ever knowingly call F? It would be faster to just go
and call G directly. RAUW would also affect non-calling users who may end
up comparing the addresses of functions.
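
To make the distinction concrete, here is a toy C++ sketch. It is not the LLVM implementation: `Use`, `UseKind`, `replaceAllUses` and `replaceDirectCallersOnly` are hypothetical names that only model the idea that a function can be referenced both as the callee of a direct call and in other ways (for example, having its address taken for a comparison).

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for an IR "use" of a function: either the callee of a direct
// call, or any other reference (for example, taking the function's address).
enum class UseKind { DirectCallee, AddressTaken };

struct Use {
    UseKind kind;
    std::string target;  // name of the function being referenced
};

// Models RAUW: every reference to `from` is redirected to `to`,
// including address-taking uses. Returns the number of rewritten uses.
inline int replaceAllUses(std::vector<Use> &uses, const std::string &from,
                          const std::string &to) {
    int changed = 0;
    for (Use &u : uses) {
        if (u.target == from) {
            u.target = to;
            ++changed;
        }
    }
    return changed;
}

// Models replacing direct callers only: call sites are redirected, but
// address-taking uses keep referring to `from`.
inline int replaceDirectCallersOnly(std::vector<Use> &uses,
                                    const std::string &from,
                                    const std::string &to) {
    int changed = 0;
    for (Use &u : uses) {
        if (u.kind == UseKind::DirectCallee && u.target == from) {
            u.target = to;
            ++changed;
        }
    }
    return changed;
}
```

In this model, redirecting only the direct callers of F to G leaves any code that compares F's address untouched, while the RAUW-style rewrite changes what that comparison sees.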

(This whole thing predates unnamed_addr, but also, I was confused about
GlobalAlias when writing it. I think I started in the belief that aliases
had distinct addresses, then later learned they didn't?)

> And the second question.
>
> There is a case where both "F" and "G" are overridable and the target
> supports global aliases. We replace "G" with an alias to "F". That's OK.
> But why do we also replace "F" with an alias to "F"? I suppose it is to
> keep callers equal where possible. But aren't callers already equal if
> caller "A" calls "F" and caller "B" calls "alias-to-F"? Perhaps there are
> other reasons?

A function which is weak may be replaced during linking with another TU
that has a strong definition. Suppose we know that the definitions are
equivalent if neither one is overridden, but we need to allow for either
one to be overridden without affecting the other?

Suppose functions F and G are equivalent and weak. What we do is create a
third function H, and make it strong. Then we replace F and G with weak
aliases (or thunks) to H. (To make this more efficient, the implementation
actually repurposes F to be H and then deletes G and writes out two
thunks/aliases. That's the "replace 'F' with alias to 'F'" you're seeing.)
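
A toy C++ model of the link-time behaviour this scheme is meant to preserve. It is only a sketch under simplifying assumptions: `Symbol`, `SymbolTable`, `define` and `mergedModule` are hypothetical names, H is modelled as a separate symbol rather than as the repurposed F, and the "linker" only knows that a strong definition replaces a weak one.

```cpp
#include <cassert>
#include <map>
#include <string>

enum class Linkage { Weak, Strong };

struct Symbol {
    Linkage linkage;
    std::string body;  // label for the code this name resolves to
};

using SymbolTable = std::map<std::string, Symbol>;

// Toy link step: a new definition is accepted if the name is unknown, or if
// it is strong and the existing definition is weak. Otherwise it is ignored.
inline void define(SymbolTable &table, const std::string &name, Symbol sym) {
    auto it = table.find(name);
    if (it == table.end() ||
        (it->second.linkage == Linkage::Weak &&
         sym.linkage == Linkage::Strong)) {
        table[name] = sym;
    }
}

// The module after merging: H is strong and holds the shared body, while
// F and G are weak names that resolve to that same body.
inline SymbolTable mergedModule() {
    SymbolTable t;
    define(t, "H", {Linkage::Strong, "shared-body"});
    define(t, "F", {Linkage::Weak, "shared-body"});
    define(t, "G", {Linkage::Weak, "shared-body"});
    return t;
}
```

If another TU then supplies a strong F, only the name F changes in this model; G and H still resolve to the shared body, which is the independence described above.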

Nick

Hello Nick,

Thank you for the explanations. I have just finished a detailed description of the MergeFunctions pass (as part of my research work) and attached it to this post. If you like, we could also publish it on the LLVM site as the pass documentation.

Thanks!
-Stepan.

MergeFunctions.doc.tar.gz (20 KB)