TargetParser: auto-generation of RISCV CPU definitions

Hi!
I would like to generate the enumerations currently available under llvm/include/llvm/Support/RISCVTargetParser.def using tablegen and the TD file llvm/lib/Target/RISCV/RISCV.td . There is a dependency issue (see below) that requires to be addressed in the implementation. I have factored it out as a patch that moves TargetParser.cpp out of LLVMSupport into a new component called LLVMTargetSupport.

Let me know if the refactoring at D137516 and the autogeneration of the CPU refs at D137517 make sense.

Please note that from a user perspective, this is a NFCI proposal.

Thanks!

Francesco

Circular dependency issue

Using llvm-tblgen results in an LLVM build failure which this NFCI proposes to resolve. However:

  1. llvm-tblgen depends on LLVMSupport (see graph below).
  2. There is a circular dependency: we cannot generate any of the components in LLVMSupport with TableGen.

Proposed solution

  1. Extract the namespaces and functions of llvm/include/llvm/Support/TargetParser.h from LLVMSupport in a new LLVM component called LLVMTargetSupport
  2. LLVMTargetSupport would live under lib/TargetSupport , and its header files would live under llvm/include/llvm/TargetSupport.

Steps

  1. Move llvm/lib/Support/TargetParser.cpp and the API of llvm/include/llvm/Support/TargetParser.h in into the component LLVMTargetSupport under llvm/lib/targetSupport and llvm/include/llvm/TargetSupport/TargetParser.h, respectively (D137516)
  2. Generate the file llvm/include/llvm/TargetSupport/RISCVTargetParser.def file tablegen using lib/Target/RISCV/RISCV.td (D137517).

I have also been thinking about exactly this. There were some cleanups I needed to do, as I think for Arm, you need to take the code for Triple and Host with you (this may not be required for RISC-V). I’ll look at your patches this week and try to get mine posted too.

One thing I was considering while looking at this for ARM/AArch64: why generate this code via tablegen rather than simply checking in the generated code?

The obvious disadvantage to the tablegen approach is added complexity. There are a couple of potential advantages I can see:

  • Generated enums won’t go out of sync with data in arrays. This can be mitigated with things like static_assert though.
  • The data is accessible from other tablgen files. We don’t have a use for this (yet?), is there one for RISCV?

Here is my proposal for the same style of split, but for all targets, not just RISC-V: ⚙ D137838 [RFC][Support] Move TargetParsers to new component

This takes all of the Target Parsers, and puts them in the same, new component. The difficult parts are:

  • llvm/ADT/Triple.h is moved into this component because it depends on the Arm target parsers.
  • llvm/Support/Host.h is moved into this component because it depends on the x86 target parser.

This would then allow more targets than just RISC-V to use tablegen in their target parsers.

This has several dependent patches:

There are likely further cleanups I might do in future, given e.g. the x86 target parser has a copy of FeatureBitset from llvm/MC/SubtargetFeature.h. I think it might be better to have this data structure just live in llvm/TargetParser, maybe along with some of the other feature handling utility classes.

One thing I wasn’t sure about was whether we should be moving the target-specific Build Attribute handling code in Support into this component too. I think we can do that later if we choose to.

Comments welcome.

1 Like

Triple.h never belonged in ADT and several times I have had an impulse to move it to Support (where its .cpp file lives). Moving it to a Support-like library seems right to me.
I’m neutral on everything else.

Hi @tmatheson - I am not sure I understand what you mean here.

  1. Why would generated enums go out of sync with arrays? And if so, why would static asserts mitigate the issue?
  2. I didn’t change the structure of the tablegen files. What are you referring to when saying data is accessible from other tablgen files?

Francesco

Separating out a TargetParser library generally sounds good to me.

From looking at your change, I see that Triple.h is a very, very popular include. I think it would be really helpful to break this change up and land it in multiple stages. You can leave a forwarding header behind at llvm/ADT/Triple.h, and then delete it in a subsequent change. We can decide whether to keep the forwarding header in the next release or not by reverting or not reverting that change from the branch. I’m not sure if there’s a way to mark a header as deprecated.

Yep, I got the “forwarding header” feedback on the patch, I just haven’t got back to it yet.

About deprecating a header: I was thinking I’d put a #warning "…" in the forwarding headers before the forwarding-#include, but that’s probably not something to do immediately, given everywhere will still use them in the short term. I’ll keep thinking about the best timing of that.