I would like to implement GCC ifunc attribute support in Clang and LLVM. At the moment Clang ignores the attribute with a warning. At the LLVM IR level there is no support for ifunc, although there is some support for the ELF symbol type @gnu_indirect_function in the ELF reader/writer and the target asm parser. This RFC is looking for thoughts and suggestions about the representation of ifunc in LLVM IR.
Alternatives for ifunc representation:
- Add a boolean flag isIFunc to
From an implementation perspective, an ifunc is very close to a function alias that points to the resolver function as its target. During asm emission the @gnu_indirect_function symbol attribute is added. In printed LLVM IR it could look like this:
@foo = alias ifunc i32 (i32), bitcast (i64 ()* @foo_ifunc to i32 (i32)*)
Pros:
- minimal changes to the LLVM code base
Cons:
- an ifunc is not really an alias: if some code looks through the alias and starts using the resolver function instead of the ifunc, it is always an error
- in particular, in my prototype I had to add weak linkage to inhibit optimizations (this works because LLVM assumes that the linker could change the symbol, but it could be fixed without changing linkage)
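To illustrate why replacing an ifunc with its resolver is always wrong, here is a minimal, portable C++ sketch that emulates the dispatch an ifunc symbol provides. It deliberately does not use the real GCC `__attribute__((ifunc(...)))`, so it builds on any platform. All names (`foo`, `foo_resolver`, `foo_generic`, `foo_fast`, `cpu_has_fast_path`) are made up for illustration.

```cpp
#include <cassert>

// Two candidate implementations with the user-visible type int(int).
static int foo_generic(int x) { return x + 1; }
static int foo_fast(int x) { return x + 1; }  // same result, assumed faster

// Hypothetical CPU feature probe; a real resolver would query the hardware.
static bool cpu_has_fast_path() { return false; }

using foo_type = int (*)(int);

// The resolver takes no arguments and returns a pointer to the chosen
// implementation. Its type differs from the type of foo itself.
foo_type foo_resolver() {
  return cpu_has_fast_path() ? foo_fast : foo_generic;
}

// Emulates the dispatch: the resolver runs once, and every later call
// goes through the pointer it returned.
int foo(int x) {
  static const foo_type impl = foo_resolver();
  return impl(x);
}
```

Callers only ever see `foo` with type `int(int)`. Code that "looked through" `foo` and called `foo_resolver` directly would get a function pointer back instead of the result of the computation.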
- Extract common parts for alias and ifunc into a base class
Both alias and ifunc will be derived classes. As in the first proposal, the @gnu_indirect_function symbol attribute is added during asm emission. In printed LLVM IR it could look like:
@foo = ifunc i32 (i32), i64 ()* @foo_ifunc
Pros:
- no confusion for existing code
- cleaner from a design perspective
Cons:
- adds a new type of global, i.e. more textual changes
- some potential code duplication (needs prototyping to estimate)
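The shape of this second alternative could be sketched roughly as follows. This is a toy model under my own assumptions, not the real LLVM class hierarchy: the names `IndirectSymbol`, `Alias`, `IFunc` and `canLookThrough` are hypothetical, and symbols are represented by plain strings.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Toy model only; these are not the real LLVM classes.
class IndirectSymbol {
public:
  enum class Kind { Alias, IFunc };

  IndirectSymbol(Kind kind, std::string name, std::string target)
      : kind_(kind), name_(std::move(name)), target_(std::move(target)) {}
  virtual ~IndirectSymbol() = default;

  Kind kind() const { return kind_; }
  const std::string &name() const { return name_; }
  // For an alias this is the aliasee; for an ifunc it is the resolver.
  const std::string &target() const { return target_; }

  // In this toy model, only an alias may be replaced by its target.
  virtual bool canLookThrough() const = 0;

private:
  Kind kind_;
  std::string name_;
  std::string target_;
};

class Alias final : public IndirectSymbol {
public:
  Alias(std::string name, std::string aliasee)
      : IndirectSymbol(Kind::Alias, std::move(name), std::move(aliasee)) {}
  bool canLookThrough() const override { return true; }
};

class IFunc final : public IndirectSymbol {
public:
  IFunc(std::string name, std::string resolver)
      : IndirectSymbol(Kind::IFunc, std::move(name), std::move(resolver)) {}
  // The resolver has a different type and must never stand in for the ifunc.
  bool canLookThrough() const override { return false; }
};
```

The shared part (name and target) lives in the base class, while existing alias-only code keeps working on `Alias` and never sees an `IFunc` by accident.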
- Emit global asm statement from Clang
Generate a global asm statement like
__asm__ (".type resolver_alias_name, @gnu_indirect_function") in Clang for alias generated for resolver function.
Pros:
- (almost?) no changes in LLVM
Cons:
- the ifunc is not marked in LLVM IR at all, so some hacks are required to detect such aliases and inhibit optimizations on them in LLVM (it is always wrong to use the resolver instead of the ifunc; they even have different function types)
- asm statements are in general not a very reliable mechanism; they depend heavily on the asm the compiler is expected to generate
- uses low-level platform-dependent information in Clang
I prototyped the first approach; the changes are in http://reviews.llvm.org/D15525. But I got feedback that I should use the second approach instead, plus a request to write an RFC about this IR extension for broader discussion. I’m convinced that the second approach is cleaner, and if there is no better suggestion or pushback, I’m going to prepare such a patch. The third approach is mostly added for completeness.
I would probably go with 2.
3 is particularly undesirable as it makes the ifunc really hard to
differentiate from an alias.
Are you going to have to teach LLVM how to look through ifuncs, or is it OK if they are totally opaque?
For example, with __attribute__((target)), you might have one target-specialized function call another. Is it important that we be able to optimize away the dynamic dispatch there or not?
Either way, I think a new IR construct is the way to go.
I think 2 is the only reasonable answer. And to answer Reid’s comment: yes, it might be nice if we could look through it for compiler generated resolver functions - something to keep in mind for the future.
I would like to support __attribute__((target)) later, so ifunc won’t be opaque for compiler-generated dispatchers.
Thank you all for the feedback!
I started prototyping the second approach, which makes ifunc a new type of GlobalValue, and I would like to reconfirm how much we want to share between aliases and ifuncs. For example, should Module have separate lists of aliases and ifuncs, or should both be in the same list of indirect symbols? If ifuncs and aliases are in separate lists, a common base class is less useful, and I found myself duplicating alias code to support ifunc (it is a relatively small amount of code, but it is spread here and there). On the other hand, if I rename aliases to indirect symbols, all places that work with aliases will have to be updated for the renaming and will sometimes have to check whether the object in the list is an alias or an ifunc (i.e. it becomes close to the first approach plus a bunch of renaming). So I would like to hear community input on how separate aliases and ifuncs should be.
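The two layouts being weighed here can be sketched side by side. This is only an illustration under assumed names (`ModuleWithSeparateLists`, `ModuleWithSharedList`, `IndirectSymbolEntry`, `aliasNames` are all hypothetical), with symbols reduced to strings.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model only; not the real LLVM Module.
enum class SymbolKind { Alias, IFunc };

struct IndirectSymbolEntry {
  SymbolKind kind;
  std::string name;
};

// Option A: one list per kind, as is done for functions, variables and aliases.
struct ModuleWithSeparateLists {
  std::vector<std::string> aliases;
  std::vector<std::string> ifuncs;
};

// Option B: a single list of indirect symbols.
struct ModuleWithSharedList {
  std::vector<IndirectSymbolEntry> indirectSymbols;
};

// With the shared list, code that only cares about aliases must check
// the kind of every entry.
std::vector<std::string> aliasNames(const ModuleWithSharedList &m) {
  std::vector<std::string> names;
  for (const IndirectSymbolEntry &entry : m.indirectSymbols)
    if (entry.kind == SymbolKind::Alias)
      names.push_back(entry.name);
  return names;
}
```

With option A, existing alias code can iterate `aliases` unchanged, at the cost of a second, mostly parallel list. With option B, every alias-only consumer needs a filter like `aliasNames`.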
If changing nothing else I would probably go with another list since
that is what is done for functions/variables/aliases.
Having said that, more than once I wished we had an easy way to
iterate all GlobalValues or all GlobalObjects. Any thoughts on the
tradeoff of having a single list with a pointer to the start of each sub
list? Something like
Func1 <-> Func2 <-> Var1 <-> Var2 <-> Alias1 <-> Alias2 <-> IFunc1 <-> Ifunc2
and keep FirstFunc, FirstVar, FirstAlias and FirstIFunc pointers.
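A rough sketch of that single-list idea, assuming a `std::list` of names and one "first" iterator per kind. `GlobalList`, `Kind` and `add` are hypothetical names, and nothing here reflects how LLVM actually stores globals.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <string>
#include <vector>

enum Kind : std::size_t { Func = 0, Var = 1, Alias = 2, IFunc = 3, NumKinds = 4 };

// Toy model only: one list holding all globals grouped by kind, plus an
// iterator to the first element of each group.
class GlobalList {
public:
  using List = std::list<std::string>;
  using Iter = List::const_iterator;

  GlobalList() { first_.fill(all_.end()); }
  // Copying would leave the stored iterators pointing into the old list.
  GlobalList(const GlobalList &) = delete;
  GlobalList &operator=(const GlobalList &) = delete;

  // Appends name to the end of the sub list for kind.
  void add(Kind kind, const std::string &name) {
    Iter pos = end(kind);
    Iter it = all_.insert(pos, name);
    // Empty sub lists at or before kind started at pos; they now start at it.
    for (std::size_t k = 0; k <= kind; ++k)
      if (first_[k] == pos)
        first_[k] = it;
  }

  Iter begin(Kind kind) const { return first_[kind]; }
  Iter end(Kind kind) const {
    return kind + 1 < NumKinds ? first_[kind + 1] : all_.end();
  }
  const List &all() const { return all_; }

private:
  List all_;
  std::array<Iter, NumKinds> first_;
};
```

Iterating `all()` visits every global in one pass, while `begin(kind)` and `end(kind)` still give the per-kind ranges that existing code expects. `std::list` is used because its iterators stay valid across insertions.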
One place where having another list might be a bit annoying is the
symbol iterator of IRObjectFile.
From what I see, putting ifuncs into a separate list is better aligned with LLVM design, so I prepared a CL that does it in http://reviews.llvm.org/D15525. The amount of changes is about 3-5x that of using a bool flag for GlobalAlias (first patchset).
Having all global objects in a single list does make some sense to me, and iterators could skip objects of uninteresting types. But I think such a change is much bigger and much deeper, so I’m not sure that the result is worth such a major change in the code base.
Bringing this RFC up again after the long holiday break. I need more feedback on the ifunc representation in LLVM.
The patch that accumulates the current feedback (i.e. ifunc is a new type of GlobalValue): http://reviews.llvm.org/D15525
If there is no high-level feedback, please contribute to the review.
Duncan, Eric, Rafael, Reid,
I would like to bring this topic back to the top and move on with ifunc support in LLVM. It looks like there are no objections against the implemented approach or the patch itself, but there is also no clear “GO” for the patch. So if you have any concerns or comments, please speak up. If you don’t, please speak up too! I’m at the LLVM conference in Barcelona, so we also have a chance to discuss it face to face.