TableGen: spring cleaning, new features for "functional programming"

Hi all,

While working on some features for the AMDGPU backend -- specifically, explicit address components for image intrinsics, which involves generating both

(a) a lot of intrinsics with different but somewhat regular parameter types, and

(b) the patterns to select instructions for those intrinsics

-- I got fed up with a lot of the TableGen bugs and limitations and set out to fix things.

I now have an already rather long list of ~30 patches which do a bunch of things in lib/TableGen/ such as:

- better and earlier error messages
- cleanup type checking
- cleanup variable resolving
- cleanup record instantiation
- late generation of anonymous records that appear in expressions
- cleanup !foreach
- add !foldl
- add !isa<type>(...) and make !cast more useful
- add !dag builtin to generate DAG nodes from lists
- some other minor new built-ins like !size, !le, !lt, !ge, !gt
- add a defset mechanism for collecting records which can then later be looped over in a foreach statement
- make foreach statements in multiclass work
- probably more...

An earlier version of the patches is here if you already want to take a look: https://cgit.freedesktop.org/~nh/llvm/log/?h=mimg

My plan is to clean those up over the next days and weeks and submit them to Phabricator.

With the exception of !foreach, which doesn't actually seem to be used, all the changes should be backward compatible. In fact, most of the larger changes are simply refactoring TableGen in a way that makes things work that look like they should already work but don't.

I would appreciate feedback and help in reviewing the patches when they arrive!

Thanks,
Nicolai

P.S.: I'm going to document some of my findings and changes on my blog, the first entry is here: https://nhaehnle.blogspot.de/2018/02/tablegen-1-what-has-tablegen-ever-done.html

Hi all,

While working on some features for the AMDGPU backend – specifically,
explicit address components for image intrinsics, which involves
generating both

(a) a lot of intrinsics with different but somewhat regular parameter
types, and

(b) the patterns to select instructions for those intrinsics

– I got fed up with a lot of the TableGen bugs and limitations and set
out to fix things.

​It must be the season. :slight_smile: I’ve been tactically cleaning up the most annoying (for me) tablegen issues I’ve ran into, but your patch set makes things way more interesting. ​

I now have an already rather long list of ~30 patches which do a bunch
of things in lib/TableGen/ such as:

  • better and earlier error messages

​+100.

  • cleanup type checking
  • cleanup variable resolving
  • cleanup record instantiation

​That can mean whole lot of different things. ​Do you have an overarching plan how tablegen is supposed to do all of the above in a consistent manner? Just making it different may not improve things much.

  • late generation of anonymous records that appear in expressions

​That would help, somewhat, though the biggest problem (IMO) with the anonymous records is that when they are instantiated using multiclass template parameters, the inner records are not folded because the values are not known yet and they may never be known because instantiated anonymous record has no access to template arguments that may be used by its fields. If you moved instantiation of anonymous records to the point where the instances of the class/multiclass that uses them are generated, that would help a lot.​

  • cleanup !foreach

​+1 for having scoped temporary variable. ATM tablegen can use a class field for that, but using it in the global scope still needs a record for it. Things also go wrong when one accidentally uses a record/field with a known value.​

  • add !foldl
  • add !isa(…) and make !cast more useful

​Nice.​

  • add !dag builtin to generate DAG nodes from lists

​Yes, please! Constructing dags using !foreach and !con()​ is a pain.

  • some other minor new built-ins like !size, !le, !lt, !ge, !gt
  • add a defset mechanism for collecting records which can then later be
    looped over in a foreach statement
  • make foreach statements in multiclass work
  • probably more…

​I’ll list few things that I have on my list of tablegen annoyances to fix:​

  • ​!subst() needs some minor tweaks when it’s applied to dag from !foreach – it wants specific types, but I sometimes want to replace a dag argument (a record) with another dag​.
    ​- I want some sort of !macro() operator that will never be folded by itself, but will instead literally substitute its value when it’s used, so it will have access to the multiclass template arguments, class fields, etc.​ We’re currently using anonymous records for similar purposes, but those are instantiated at the wrong time, so it’s often impossible to use them as sort of ‘subroutines’. Your late generation change you’ve mentioned above may help, but I’m not sure if that’s the use pattern you had in mind.

An earlier version of the patches is here if you already want to take a
look: https://cgit.freedesktop.org/~nh/llvm/log/?h=mimg

My plan is to clean those up over the next days and weeks and submit
them to Phabricator.

​I’d be happy to help reviewing them.​

With the exception of !foreach, which doesn’t actually seem to be used,

I do have few outstanding changes for NVPTX where I rely on !foreach, but I think it should not be too hard to adapt them.​

all the changes should be backward compatible. In fact, most of the
larger changes are simply refactoring TableGen in a way that makes
things work that look like they should already work but don’t.

I would appreciate feedback and help in reviewing the patches when they
arrive!

​Count me in. And thank you for doing this work!

–Artem​

Hi Artem,

Thank you for your encouraging reply :slight_smile:

I have now cleaned up the first batch of changes and put them on Phabricator here: https://reviews.llvm.org/D43552

I've tried to keep individual changes small, and I've verified with `git rebase -x` that the build is good after each change. This first batch does not cause any changes in backend's generated files.

[snip]> - cleanup type checking> - cleanup variable resolving> - cleanup record instantiation> > ​That can mean whole lot of different things. ​Do you have an > overarching plan how tablegen is supposed to do all of the above in a > consistent manner? Just making it different may not improve things much.
Right, and we'll have to see in reviews. But just to give you an example, for the last point (cleaning up record instantiation), I ended up removing more than 150 lines of code while simultaneously enabling simple cases of foreach inside of multiclasses. I think that speaks for itself.

When it comes to variable resolving, I propose a simple resolver interface, see https://reviews.llvm.org/D43564.

This allows multiple template arguments to be resolved simultaneously, which should be a small performance improvements aside from the more subjective cleanup of the IMO odd resolveReferences interface. At the same time, this allows proper variable hiding when resolving inside nested !foreachs.

The type system stuff is more vague, and is mostly about making sure that we actually get the right types everywhere. For example, list types end up being incorrect all over the place at the moment.

    - late generation of anonymous records that appear in expressions

​That would help, somewhat, though the biggest problem (IMO) with the anonymous records is that when they are instantiated using multiclass template parameters, the inner records are not folded because the values are not known yet and they may never be known because instantiated anonymous record has no access to template arguments that may be used by its fields. If you moved instantiation of anonymous records to the point where the instances of the class/multiclass that uses them are generated, that would help a lot.​

That's precisely what I've done, if I understand you correctly.

I've added a VarDefInit class which acts as an uninstantiated anonymous record and will "fold" to a DefInit once the template arguments are fully resolved to contain no variable references.

Additional benefits of this are that we don't keep partially resolved anonymous records in the RecordKeeper, and anonymous records that are instantiated with the same template arguments are instantiated only once, so there's less computation and memory use.

    - cleanup !foreach

​+1 for having scoped temporary variable. ATM tablegen can use a class field for that, but using it in the global scope still needs a record for it. Things also go wrong when one accidentally uses a record/field with a known value.​

... and when nesting !foreach with the same iteration variable :wink:

    - add !foldl

    - add !isa<type>(...) and make !cast more useful

​Nice.​

    - add !dag builtin to generate DAG nodes from lists

​Yes, please! Constructing dags using !foreach and !con()​ is a pain.

    - some other minor new built-ins like !size, !le, !lt, !ge, !gt
    - add a defset mechanism for collecting records which can then later be
    looped over in a foreach statement
    - make foreach statements in multiclass work
    - probably more...

​I'll list few things that I have on my list of tablegen annoyances to fix:​

- ​!subst() needs some minor tweaks when it's applied to dag from !foreach -- it wants specific types, but I sometimes want to replace a dag argument (a record) with another dag​.

I haven't touched !subst, but I have a similar issue with !dag.

One of the things I've been thinking about (but haven't actually done) is that we should perhaps make the type lattice complete, by making UnsetInit typed with a new UnsetType, and adding an AnyType.

I ran into this mostly because it would make sense for !dag(op, childnodes, childnodenames) to de facto have the type !dag(any, list<any> or unset, list<string> or unset), or perhaps !dag(anyrecord, list<any> or unset, list<string> or unset).

​- I want some sort of !macro() operator that will never be folded by itself, but will instead literally substitute its value when it's used, so it will have access to the multiclass template arguments, class fields, etc.​ We're currently using anonymous records for similar purposes, but those are instantiated at the wrong time, so it's often impossible to use them as sort of 'subroutines'. Your late generation change you've mentioned above may help, but I'm not sure if that's the use pattern you had in mind.

Actually yes, that's one of the things I rely on quite heavily in my AMDGPU changes. The AMDGPU backend already uses a pattern to implement subroutines that looks like this:

class getWhatever<parameters> {
   type ret = ...;
}

This tends to work out in many cases because getWhatever<...>.ret is substituted early, but there are corner cases when you push it further, and I believe those are all fixed with the VarDefInit change I've described above. In any case, I'm using "subroutines" very heavily :slight_smile: [0]

The other use pattern I had in mind is that I want to concatenate intrinsic type lists that include LLVMMatchType, and so I need to generate LLVMMatchType with calculated reference numbers, and that just doesn't fly today.

    An earlier version of the patches is here if you already want to take a
    look: https://cgit.freedesktop.org/~nh/llvm/log/?h=mimg

    My plan is to clean those up over the next days and weeks and submit
    them to Phabricator.

​I'd be happy to help reviewing them.​

Much appreciated!

    With the exception of !foreach, which doesn't actually seem to be used,

I do have few outstanding changes for NVPTX where I rely on !foreach, but I think it should not be too hard to adapt them.​

That's good to hear :slight_smile:

Thanks,
Nicolai

[0] http://nhaehnle.blogspot.de/2018/02/tablegen-2-functional-programming.html

Hi Nicolai,

Thanks for your work, and I do enjoy reading your blog posts. It would be nice adding links to your post somewhere on existing tablegen documents, or even better, you can improve them (for clarity and readability :p).

Cheers,
chenwj

Hi Chen,

I do hope to get around to some documentation cleanups as well :slight_smile:

Cheers,
Nicolai

Nicolai,

I want to say huge thank you for your improvements to tablegen.
While it’s still far from perfect, I now have a hope that one day
I’ll be able to just write something in tablegen, as opposed to
constantly struggling to trick tablegen into doing what I need it to do.

Thank you.

–Artem

Hi Nicolai,

if you’re open to it, I’d like suggest another smallish cleanup in libTablegen, namely doing something about the layering. Basicly what bit me a few times is that there is a Main.cpp file. This particular file adds CLI options such as -o, -d, -I. So if you build some out-of-tree tool and happen to link against all components because you’re lazy, there is a high likelihood of an option clash, which are nasty to debug. Can we move these options outside of the library and put them in the tool instead?

Cheers,
Philip

Hi Artem,

I want to say huge thank you for your improvements to tablegen.
While it's still far from perfect, I now have a hope that one day
I'll be able to *just write* something in tablegen, as opposed to
constantly struggling to trick tablegen into doing what I need it to do.

And a huge thank you for your quick reviews on what was a *really* long series of patches!

I agree there's still stuff to be done. I will continue to work on some writeups of things I've learned, hopefully leading to some better documentation.

There are also some remaining known quirks, off the top of my head:

1. The weirdness surrounding name resolution, documented in https://reviews.llvm.org/D44478. Unfortunately, fixing this requires major surgery in several backends.

2. We could probably flatten the hierarchy of Inits by getting rid of TypedInits and instead adding an `unset` type and an `any` type.

However, I don't think I'll get to either of those any time soon. If anybody wants to take a stab at the first one, I've pushed a patch here which illustrates the idea: https://cgit.freedesktop.org/~nh/llvm/log/?h=tablegen-name-resolution which contains the lib/TableGen changes.

Cheers,
Nicolai

Hi Philip,

if you're open to it, I'd like suggest another smallish cleanup in libTablegen, namely doing something about the layering. Basicly what bit me a few times is that there is a `Main.cpp` file. This particular file adds CLI options such as -o, -d, -I. So if you build some out-of-tree tool and happen to link against `all` components because you're lazy, there is a high likelihood of an option clash, which are nasty to debug. Can we move these options outside of the library and put them in the tool instead?

No. The reason for those options being there is that any TableGen frontend will want them. In particular, clang-tblgen uses this functionality.

Cheers,
Nicolai