When two functions are the same could clang optimize out one of them?

Hi,
When the generated instructions of two functions “f()” & “g()” are the same(when templates are used it isn’t unusual), could clang remove the function “f()” in the object file and translate “f()” calls to “g()” calls?

Thanks
–Yao

C and C++ require that almost all objects, including function objects, have different addresses. In principle, we can detect cases where a program doesn't rely on the address of a function; in other cases we can just emit one function as a tail call to the other. There's actually already an LLVM optimization that does something like this, although I believe it's not enabled by default because detecting the equivalence of functions is very difficult to do quickly.

John.

Templates aren't even needed. At least on x86, accessor functions like

int getFoo() const { return m_foo; }

all boil down to "return *(this + offset)". I've seen Visual C++ 7.1
merge these functions across different classes during optimization. I
was quite the WTF moment during debugging.

Csaba

When the generated instructions of two functions “f()” & “g()” are the same(when templates are used it isn’t unusual), could clang remove the function “f()” in the object file and translate “f()” calls to “g()” calls?

C and C++ require that almost all objects, including function objects, have different addresses.

This isn’t a big problem, just pad some nop/jmp instructions.

In principle, we can detect cases where a program doesn’t rely on the address of a function; in other cases we can just emit one function as a tail call to the other. There’s actually already an LLVM optimization that does something like this, although I believe it’s not enabled by default because detecting the equivalence of functions is very difficult to do quickly.

I know only a little about compilation theories, so the following possibly are just nonsense.
Detect equivalence of functions may be done at two different levels, one is at the IR level, the other is at the C++ function template level. The second one is useful because different instances of the same template should be the largest source of equivalence. To detect at the IR level: 1. get hash of a functions (can we??? I have no idea on this). 2. sort the hash 3. search & compare. To detect at the C++ function template level is much more complicated, I currently don’t know what situations should be considered.

This can't be represented as a single function at the IR level, and
the 'jmp to the other function' is already implemented for a few
backends when they see functions that consist of only a tail call. The
only reason you're not seeing this happening is probably because the
-mergefunc pass isn't run by default (as John mentioned).

The 'nop' variant could possibly be a nice optimization for tail calls
to local functions; I'm sure patches are welcome :).

Like I said, it’s a cost/benefit analysis. There’s less potential for compression than you think: a C++ program might have lots of templated functions, but most will be so tiny that there’s no profit in merging them, even when they can’t just be inlined everywhere. There’s profit in merging larger functions, but they’re also more expensive to analyze, especially as they tend to have calls to other functions, which we then need to recursively ensure the equivalence of.

Do you have a specific goal in mind for this optimization?

John.

FWIW, LLVM does have a function body merging pass (lib/Transforms/IPO/MergeFunctions.cpp). It isn't turned on by default yet, but hopefully will be in the LLVM 3.0 release.

-Chris