[RFC] Reworking the TargetMachine Interface

The TargetMachine class describes all specifications of the machine being targeted in LLVM. Currently, the TargetMachine class is intended to represent specifications of targets that can take LLVM IR and generate object files/MC representation from it. There is another class that inherits directly from TargetMachine called LLVMTargetMachine, which represents targets that use LLVM’s target-independent code generator library (libCodeGen) to implement their code generation process.

This class hierarchy design predates the introduction of the MC layer. It was intended to allow targets to use their own code generation library instead of the CodeGen library provided by LLVM. In recent years, however, it has caused a few headaches:

  1. TargetMachine objects are created through a factory method on the Target class. Because this factory method only returns a TargetMachine pointer, and there is no dyn_cast support for telling whether a TargetMachine is an LLVMTargetMachine, the LLVM code base is filled with static_casts from TargetMachine to LLVMTargetMachine. This has worked so far because all in-tree targets in LLVM are indeed LLVMTargetMachines, but it is sub-optimal and verbose.
  2. The separation between TargetMachine and LLVMTargetMachine is not correctly enforced to begin with. More specifically, as time went by, functions were added to TargetMachine that directly rely on the target being implemented using the CodeGen library. Some examples include the MachineFunctionInfoYaml interface and the passing of an optional llvm::MachineModuleInfoWrapperPass to the addPassesToEmitFile function.
  3. The current interface is causing some issues with refactoring certain aspects of the CodeGen library. One example is Pull Request #105541 in llvm/llvm-project, "Make MMIWP not have ownership over MMI + Make MMI Only Use an External MCContext" by matinraayai. This specific issue can also interfere with current efforts to port CodeGen to use the new pass manager.
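
To make item 1 concrete, here is a minimal sketch of the downcast pattern. It uses simplified stand-in classes rather than the real LLVM headers: the `before` namespace, `createTargetMachine`, `usesLibCodeGen`, and `queryCodeGen` are all hypothetical names chosen for illustration, and only the class names mirror the ones discussed above.

```cpp
// Simplified stand-ins for illustration only; these are NOT the real LLVM classes.
#include <cassert>
#include <memory>
#include <string>

namespace before {

// Stand-in for TargetMachine: the only type the factory hands back.
class TargetMachine {
public:
  virtual ~TargetMachine() = default;
  virtual std::string getName() const { return "generic"; }
};

// Stand-in for LLVMTargetMachine: adds CodeGen-specific functionality.
class LLVMTargetMachine : public TargetMachine {
public:
  std::string getName() const override { return "codegen-based"; }
  // Hypothetical CodeGen-only query, not reachable through TargetMachine.
  bool usesLibCodeGen() const { return true; }
};

// Hypothetical factory with the shape described in item 1: it returns only a
// base-class pointer, even though the object is an LLVMTargetMachine.
inline std::unique_ptr<TargetMachine> createTargetMachine() {
  return std::make_unique<LLVMTargetMachine>();
}

// A caller that needs CodeGen functionality has to downcast. static_cast
// performs no runtime check, so this is only correct while every target
// really is an LLVMTargetMachine.
inline bool queryCodeGen(TargetMachine &TM) {
  auto &LLVMTM = static_cast<LLVMTargetMachine &>(TM);
  return LLVMTM.usesLibCodeGen();
}

} // namespace before
```

If a target ever derived from TargetMachine directly, a `queryCodeGen`-style call on it would be undefined behavior, and the compiler would not flag it.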

The following PR aims to overhaul the TargetMachine interface: Pull Request #111234 in llvm/llvm-project, "Overhaul the TargetMachine and LLVMTargetMachine Classes" by matinraayai. The PR merges all functionality from the LLVMTargetMachine and TargetMachine classes into a single TargetMachine class, which means TargetMachine now includes functionality that directly relates to IR/MIR/MC. Any TargetMachine that doesn’t want to use the LLVM CodeGen library can simply not implement the related interface functions.

With this change, the LLVMTargetMachine interface will be renamed to CodeGenTargetMachine, and its interface file will be moved under CodeGen instead of living in the same file as TargetMachine. Instead of representing all TargetMachines implemented using LLVM’s CodeGen library, it becomes simply a set of function implementations of the TargetMachine interface. This allows more flexibility for new target implementations without breaking current targets: current targets all inherit from CodeGenTargetMachine to get access to the CodeGen functionality shared among all in-tree LLVM targets, while new targets have the option not to use CodeGenTargetMachine at all and to create a new target from scratch that still uses the LLVM CodeGen library if desired.
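
The layering described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the PR: the `after` namespace, the `addEmitPasses` hook, the pass names, and the two example targets are hypothetical, and only the TargetMachine and CodeGenTargetMachine class names come from the proposal.

```cpp
// Illustrative sketch of the proposed layering; NOT the actual PR code.
#include <cassert>
#include <string>
#include <vector>

namespace after {

// Stand-in for the merged TargetMachine: one interface that also exposes
// CodeGen-related hooks, which report "unsupported" by default.
class TargetMachine {
public:
  virtual ~TargetMachine() = default;
  // Hypothetical hook: appends pass names and returns true on success.
  // The default adds nothing and returns false, which models a target that
  // simply does not implement the CodeGen-related functions.
  virtual bool addEmitPasses(std::vector<std::string> &Passes) {
    (void)Passes;
    return false;
  }
};

// Stand-in for CodeGenTargetMachine: a shared set of implementations of the
// TargetMachine interface, built on the common CodeGen pipeline.
class CodeGenTargetMachine : public TargetMachine {
public:
  bool addEmitPasses(std::vector<std::string> &Passes) override {
    Passes.push_back("instruction-selection"); // hypothetical pass name
    Passes.push_back("asm-printer");           // hypothetical pass name
    return true;
  }
};

// An existing in-tree style target keeps inheriting the shared implementation.
class InTreeTargetMachine : public CodeGenTargetMachine {};

// A new target may derive from TargetMachine directly and skip the hooks.
class CustomTargetMachine : public TargetMachine {};

} // namespace after
```

In this shape, callers only ever talk to `TargetMachine` and check the return value of the hook, so no downcast to a CodeGen-specific subclass is needed.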

This change seems to work with in-tree targets and does not break the current library layering (i.e. there is no circular dependency between MC/Target/CodeGen). However, we suspect that some out-of-tree LLVM targets might be affected by this change, so we ask the community to discuss it before it is merged.