TargetParser: auto-generation of RISCV CPU definitions

Hi!
I would like to generate the enumerations currently available under llvm/include/llvm/Support/RISCVTargetParser.def using tablegen and the TD file llvm/lib/Target/RISCV/RISCV.td . There is a dependency issue (see below) that requires to be addressed in the implementation. I have factored it out as a patch that moves TargetParser.cpp out of LLVMSupport into a new component called LLVMTargetSupport.

Let me know if the refactoring at D137516 and the autogeneration of the CPU refs at D137517 make sense.

Please note that from a user perspective, this is a NFCI proposal.

Thanks!

Francesco

Circular dependency issue

Using llvm-tblgen results in an LLVM build failure which this NFCI proposes to resolve. However:

  1. llvm-tblgen depends on LLVMSupport (see graph below).
  2. There is a circular dependency: we cannot generate any of the components in LLVMSupport with TableGen.

Proposed solution

  1. Extract the namespaces and functions of llvm/include/llvm/Support/TargetParser.h from LLVMSupport in a new LLVM component called LLVMTargetSupport
  2. LLVMTargetSupport would live under lib/TargetSupport , and its header files would live under llvm/include/llvm/TargetSupport.

Steps

  1. Move llvm/lib/Support/TargetParser.cpp and the API of llvm/include/llvm/Support/TargetParser.h in into the component LLVMTargetSupport under llvm/lib/targetSupport and llvm/include/llvm/TargetSupport/TargetParser.h, respectively (D137516)
  2. Generate the file llvm/include/llvm/TargetSupport/RISCVTargetParser.def file tablegen using lib/Target/RISCV/RISCV.td (D137517).

I have also been thinking about exactly this. There were some cleanups I needed to do, as I think for Arm, you need to take the code for Triple and Host with you (this may not be required for RISC-V). I’ll look at your patches this week and try to get mine posted too.

One thing I was considering while looking at this for ARM/AArch64: why generate this code via tablegen rather than simply checking in the generated code?

The obvious disadvantage to the tablegen approach is added complexity. There are a couple of potential advantages I can see:

  • Generated enums won’t go out of sync with data in arrays. This can be mitigated with things like static_assert though.
  • The data is accessible from other tablgen files. We don’t have a use for this (yet?), is there one for RISCV?

Here is my proposal for the same style of split, but for all targets, not just RISC-V: ⚙ D137838 [RFC][Support] Move TargetParsers to new component

This takes all of the Target Parsers, and puts them in the same, new component. The difficult parts are:

  • llvm/ADT/Triple.h is moved into this component because it depends on the Arm target parsers.
  • llvm/Support/Host.h is moved into this component because it depends on the x86 target parser.

This would then allow more targets than just RISC-V to use tablegen in their target parsers.

This has several dependent patches:

There are likely further cleanups I might do in future, given e.g. the x86 target parser has a copy of FeatureBitset from llvm/MC/SubtargetFeature.h. I think it might be better to have this data structure just live in llvm/TargetParser, maybe along with some of the other feature handling utility classes.

One thing I wasn’t sure about was whether we should be moving the target-specific Build Attribute handling code in Support into this component too. I think we can do that later if we choose to.

Comments welcome.

1 Like

Triple.h never belonged in ADT and several times I have had an impulse to move it to Support (where its .cpp file lives). Moving it to a Support-like library seems right to me.
I’m neutral on everything else.

Hi @tmatheson - I am not sure I understand what you mean here.

  1. Why would generated enums go out of sync with arrays? And if so, why would static asserts mitigate the issue?
  2. I didn’t change the structure of the tablegen files. What are you referring to when saying data is accessible from other tablgen files?

Francesco

Separating out a TargetParser library generally sounds good to me.

From looking at your change, I see that Triple.h is a very, very popular include. I think it would be really helpful to break this change up and land it in multiple stages. You can leave a forwarding header behind at llvm/ADT/Triple.h, and then delete it in a subsequent change. We can decide whether to keep the forwarding header in the next release or not by reverting or not reverting that change from the branch. I’m not sure if there’s a way to mark a header as deprecated.

Yep, I got the “forwarding header” feedback on the patch, I just haven’t got back to it yet.

About deprecating a header: I was thinking I’d put a #warning "…" in the forwarding headers before the forwarding-#include, but that’s probably not something to do immediately, given everywhere will still use them in the short term. I’ll keep thinking about the best timing of that.

There is now a new TargetParser component! I landed my patch in rGf09cf34d0062 - this should unblock any work to generate target parser definitions with tablegen, like this for risc-v.

As suggested, I have left forwarding headers in place for all the *.h files. The *.def files I didn’t leave forwarding headers for, but those includes should have infected many fewer places. I will spend some time over the next few months getting rid of references to the forwarding headers and instead point at the real version of the headers. However, I do think the forwarding headers should stay for LLVM 16 at least, as I realise this was landed closer to a release than might be easy to resolve.

There were still some buildbot failures from the change, but I’ve tried to link to any minor fixups from the change itself (D137838) so they are easier to trace. Most were missed TargetParser dependencies.

Downstream Impact

I do expect downstream impact (which I’ve spent a few hours dealing with in a downstream Arm compiler). The most unexpected place for these is if you use any non-inline Triple methods, you will need the TargetParser llvm component in your dependencies, but you’ll of course also need it for any other uses of the TargetParser and the related classes.

The llvm/include/llvm/Support/*.h files are also very likely to give merge conflicts if you have downstream changes, because git doesn’t really detect the move and replace sensibly. The way to resolve these is to move the downstream versions of these files to the new location (updating includes and the include guards), and then leave the old location as a forwarding header, as it is on main now.

This is a great step forward.

Which reminds me, someone asked about doing the same thing for AVR about a year ago - [llvm-dev] AVR target, add llvm/Support/AVRTargetParser. If there is still interest there, it’s now possible!

The patch that does automatic generation is now dependent on @lenary 's version of the split of TargetParser from LLVMSupport. May I get another round of review at ⚙ D137517 [TargetParser] Generate the defs for RISCV CPUs using llvm-tblgen., please?

Thank you!

And now the time comes: yesterday and today, I landed a lot of NFC patches to remove the forwarding headers and replace their uses with the new header locations. As of 62c7f035b4 this is mostly complete.

I have one forwarding header yet to finish, but this is depended on by isl, which Polly contains a copy of in the upstream repo. In this case I’m going to add a warning if you use the forwarding header, instead of deleting it quite yet - but I still intend to some time in the next few weeks. I’ve sent the maintainer of isl an email about this, and I will ask the Polly maintainers to update their copy of isl when this is done.

1 Like