I’ve been investigating what is needed to ensure that command line options are passed to the backend codegen passes during LTO, and to enable compiling different functions in a module with different command line options (see the links below for previous discussions).
The command line options I’m currently looking into are “-target-cpu” and “-target-feature” and I would like to get feedback about the approach I’ve taken (patches attached).
The attached patches make the following changes:
In TargetMachine::getSubtarget(const Function*) and MachineFunction’s constructor, use the per-function subtarget object instead of TargetMachine’s (module-level) subtarget object. This allows passes like SelectionDAG to switch the target on a per-function basis.
Define class TargetOptions::Option, which records whether an option has appeared on the command line along with the option’s value. Long term, this might not be the best solution and I expect it will be modified or replaced when the new command line infrastructure becomes available.
Fix X86’s subtarget lookup to override the function attributes if the corresponding options were specified on the command line.
Fix clang to embed “-target-cpu” and “-target-feature” attributes in the IR.
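To make the second and third items above more concrete, here is a rough, self-contained sketch of the idea. It is not code from the attached patches: the `Option` template, `resolveTargetString`, and the fallback to a module-level default when a function carries no attribute are all hypothetical names and assumptions chosen for illustration.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for the proposed TargetOptions::Option: it stores an
// option's value together with a flag recording whether the option actually
// appeared on the command line.
template <typename T>
class Option {
  T Value{};
  bool Specified = false;

public:
  Option() = default;

  // Called when the option is seen on the command line.
  void set(T V) {
    Value = std::move(V);
    Specified = true;
  }

  bool isSpecified() const { return Specified; }
  const T &get() const { return Value; }
};

// Choose the CPU (or feature string) to use for one function.
// Assumed precedence: an explicit command line option overrides the function
// attribute; an empty attribute falls back to the module-level default.
inline std::string resolveTargetString(const Option<std::string> &CmdLine,
                                       const std::string &FnAttr,
                                       const std::string &ModuleDefault) {
  if (CmdLine.isSpecified())
    return CmdLine.get();
  if (!FnAttr.empty())
    return FnAttr;
  return ModuleDefault;
}
```

With this shape, a function whose attribute says “haswell” keeps that CPU unless the command line option was explicitly set, in which case the command line value wins.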
I’ve tested the changes I made and confirmed that target options such as “-mavx2” don’t get dropped during LTO and are honored by backend codegen passes.
This is my plan for the remaining tasks:
Fix other in-tree targets and other code-gen passes that are still using TargetMachine’s subtarget where the per-function subtarget should be used.
Fix TargetTransformInfo to compute the various code-gen costs accurately when the subtarget is switched on a per-function basis. One way to do this is to make the pointer or reference to the Function object available to the various subclasses of TargetTransformInfo by defining the necessary functions in FunctionTargetTransformInfo (similar to the changes made in r218004). However, passes like the Inliner that are not function passes cannot access FunctionTargetTransformInfo, so for those passes it has to be done in a different way.
Forbid inlining functions that have incompatible CPU and feature attributes. The simplest approach seems to be to allow inlining only if the CPU and feature attributes match exactly, but it’s also possible to relax this restriction.
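As an illustration of the last item, the two policies could look roughly like the sketch below. This is only a sketch under stated assumptions: `TargetAttrs`, `exactMatch`, `enabledFeatures`, and `relaxedMatch` are hypothetical names, the feature string is assumed to be a comma-separated list of “+feature” / “-feature” entries, and the subset rule is just one possible relaxation, not something the patches implement. It also ignores features implied by the CPU name.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <sstream>
#include <string>

// Hypothetical summary of the two attributes discussed above.
struct TargetAttrs {
  std::string CPU;
  std::string Features; // assumed format: "+avx,+avx2,-sse4a"
};

// Strict policy: inline only if both attributes are textually identical.
inline bool exactMatch(const TargetAttrs &Caller, const TargetAttrs &Callee) {
  return Caller.CPU == Callee.CPU && Caller.Features == Callee.Features;
}

// Collect the features left enabled after applying the list left to right.
inline std::set<std::string> enabledFeatures(const std::string &Features) {
  std::set<std::string> Enabled;
  std::istringstream In(Features);
  std::string Tok;
  while (std::getline(In, Tok, ',')) {
    if (Tok.size() < 2)
      continue; // skip empty or malformed entries
    if (Tok[0] == '+')
      Enabled.insert(Tok.substr(1));
    else if (Tok[0] == '-')
      Enabled.erase(Tok.substr(1));
  }
  return Enabled;
}

// One possible relaxation (an assumption): same CPU, and every feature the
// callee enables is also enabled in the caller.
inline bool relaxedMatch(const TargetAttrs &Caller, const TargetAttrs &Callee) {
  if (Caller.CPU != Callee.CPU)
    return false;
  const std::set<std::string> CallerSet = enabledFeatures(Caller.Features);
  const std::set<std::string> CalleeSet = enabledFeatures(Callee.Features);
  // std::includes requires sorted ranges, which std::set guarantees.
  return std::includes(CallerSet.begin(), CallerSet.end(),
                       CalleeSet.begin(), CalleeSet.end());
}
```

Under the relaxed rule, a caller built with “+avx,+avx2” could inline a callee built with “+avx”, but not the other way around, since the inlined body might then use instructions the caller’s target does not guarantee.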
cpufs_llvm1.patch (11.8 KB)
cpufs_clang1.patch (3.63 KB)