I have a pass that adds a function to the module that contains “main”. The same function is also defined as a weak symbol in a shared library that I load at runtime with LD_PRELOAD.
When I run the program, the weak function from the library gets called instead of the function added by the pass.
Is this the correct behavior? Is there a way to call the strong symbol when it is present in the module?
In other words, when I don’t apply the pass I want the program to call the weak function, and when I do apply the pass I want it to call the function added by the pass.
I wouldn’t expect this, but there are possible subtleties. For instance, who is calling the function? Is the call site in the dylib? Is it in the same translation unit as the definition? (It could have been inlined, for instance.) Is the library built with protected/hidden visibility? Which platform are you on?
Also, this is not really related to LLVM or to the fact that you add the symbol in a pass. You should be able to reproduce it with two trivial files: main.c, which defines main() and foo(), and lib.c, which defines a weak foo() and is built as a library. Can you reproduce it with such a setup? Otherwise something else is going on in your build…
ELF dynamic loaders generally do not distinguish between weak and strong symbols. They only look at the visibility, which can be internal, hidden, default, or protected.
If you’re trying to allow the main program to customize some aspect of your LD_PRELOAD’ed tool, you probably want to use an extern weak symbol. I forget the details on how to do this, but it looks something like this:
extern void __attribute__((weak)) myhook(void);
…
if (&myhook) {
  myhook();
}
“ELF dynamic loaders generally do not distinguish between weak and strong symbols. They only look at the visibility, which can be internal, hidden, default, or protected.”
Was it always the case?
The only thing I can find in the documentation is the description of the LD_DYNAMIC_WEAK environment variable: “(glibc since 2.1.91) Allow weak symbols to be overridden (reverting to old glibc behaviour).”
I’m posting this here in case anyone else is interested:
“The concept of the weak symbol is that the symbol is marked as a lower priority and can be overridden by another symbol. Only if no other implementation is found will the weak symbol be the one that it used.
The logical extension of this for the dynamic loader is that all libraries should be loaded, and any weak symbols in those libraries should be ignored for normal symbols in any other library. This was indeed how weak symbol handling was originally implemented in Linux by glibc.
However, this was actually incorrect to the letter of the Unix standard at the time (SysVr4). The standard actually dictates that weak symbols should only be handled by the static linker; they should remain irrelevant to the dynamic linker (see the section on binding order below).
At the time, the Linux implementation of making the dynamic linker override weak symbols matched with SGI’s IRIX platform, but differed to others such as Solaris and AIX. When the developers realised this behaviour violated the standard it was reversed, and the old behaviour relegated to requiring a special environment flag (LD_DYNAMIC_WEAK) be set.”
Because LD_PRELOAD loads the library before every other library, the weak symbol in the dylib gets priority.
If I compile the program and link it against the shared library, I get the behavior I wanted and can apply the trick that Reid suggested.