Std::make_unique<> and TargetOpcode

This is beginnerish but I thought other backends might find it useful.

modernize/replace reset() with std::make_unique<>

GlobalISel was designed+implemented circa 2015+16. LLVM then didn’t allow C++14. Consequently (and for example) the AArch64 backend’s Subtarget constructor looks like:

CallLoweringInfo.reset(new AArch64CallLowering(*getTargetLowering()));
InlineAsmLoweringInfo.reset(new InlineAsmLowering(getTargetLowering()));
Legalizer.reset(new AArch64LegalizerInfo(*this));
auto *RBI = new AArch64RegisterBankInfo(*getRegisterInfo());
InstSelector.reset(createAArch64InstructionSelector(
      *static_cast<const AArch64TargetMachine *>(&TM), *this, *RBI));

The LLVM codebase now allows C++14 and so reset() can be replaced by std::make_unique<>. To do this, AArch64InstructionSelector also needs to be refactored to have a AArch64InstructionSelector.h file. clang-tidy even offers a check for the reset() → make_unique<> conversion, modernize-make-unique. I think the result looks cleaner:

CallLoweringInfo = std::make_unique<AArch64CallLowering>(*getTargetLowering());
Legalizer        = std::make_unique<AArch64LegalizerInfo>(*this);
RegBankInfo      = std::make_unique<AArch64RegisterBankInfo>(*getRegisterInfo());
InstSelector     = std::make_unique<AArch64InstructionSelector>(
                           *static_cast<const AArch64TargetMachine *>(&TM),
                           *this, (AArch64RegisterBankInfo&) *RegBankInfo);

TargetOpcode shouldn’t appear in a backend

TargetOpcode is a namespace for “Invariant opcodes: All instruction sets have these as their low opcodes.” However, these opcodes are also copied into the backend’s namespace by TableGen, for example, into AArch64GenInstrInfo.inc.

Using both namespaces gets a little messy. As an example, AArch64 uses both AArch64::CFI_INSTRUCTION and TargetOpcode::CFI_INSTRUCTION. You can also get namespace collisions for using both at the same time which you can’t if you restrict the backend to its own namespace.

So I think that the TargetOpcode namespace shouldn’t appear in a backend. A backend should just use its own namespace. It’s not hard to make this change (grep) but I suppose a clang-tidy rule restricted for backends could help.

I also use using namespace wherever possible to further simplify the code.

using namespace llvm;
using namespace AArch64;

I think this is not an improvement - using TargetOpcode makes it clear that an opcode is generic, rather than target-specific. That said, if you using namespace AArch64;, you should be able to use TargetOpcode::CFI_INSTRUCTION explicitly, I think?

Well, the ARM backend uses ISD::INLINEASM_BR, ARM::INLINEASM_BR and TargetOpcode::INLINEASM_BR. The ARM backend also has using namespace TargetOpcode;

Despite ISD and TargetOpcode namespaces both having INLINEASM_BR, they are different namespaces. Of course, they always have to be distinguished.

But the AArch64 namespace is a superset of TargetOpcode. I think distinguishing them is less necessary.

BuildMI(MBB, MBBI, dl, TII.get(TargetOpcode::CFI_INSTRUCTION))
BuildMI(MBB, MBBI, dl, TII.get(AArch64::CFI_INSTRUCTION))
BuildMI(MBB, MBBI, dl, TII.get(CFI_INSTRUCTION)) // this is not used in the AArch64 backend

These are all the same. But for myself, I prefer the last one and I’m not seeing a useful distinction between the first two. Tablegen is doing us a favor by copying TargetOpcode into the backend’s namespace and then I think using namespace can do us another favor.

But maybe it comes down to personal preference.

My opinion (and it’s only that, an opinion) is that it is useful in many places in the backend to be clear when an instruction is target-generic and so might be dealt with specially by a generic backend pass, vs when it is target-specific. This is not always clear from the name of opcode - GlobalIsel uses a G_ prefix for the generic operations it is selecting from, but there’s no such consensus for non-GlobalIsel target-generic names.