Manipulating DAGs in TableGen

This is a proposal to enhance TableGen's ability to analyze and manipulate
DAGs. Hopefully this will allow more complex DAGs to be built in TableGen.

1. Add a new value suffix.

   value(index)

   The value must be a DAG. The index specifies the operator or an operand,
   whose value is produced. The index can be

   0        produce the operator
   1...n    produce operand by position
   $name    produce operator/operand by its variable name
   string   produce operator/operand by a string containing its
            variable name

   If the item does not exist, ? (uninitialized) is produced.

   Note that multiple value suffixes are allowed, so, for example,
   DagList[i](1) would produce the first operand of the i-th dag
   in the list.

2. Add the !getdag() bang operator.

   !getdag(dag, index [, default])

   This bang operator produces the same result as the (...) suffix.
   However, the default value can be specified as the third argument.
   If it is not specified, ? is used.

3. Add the !setdag bang operator.

   !setdag(dag, index1, value1, index2, value2, ...)

   This bang operator creates a copy of the top-level dag node and
   then replaces the operator and/or operands with new values. Each
   replacement is specified by an index and a value. The new dag
   is produced.

The two new bang operators replace !getop() and !setop(), which could be
deprecated.

4. The !size operator will be extended to accept a DAG and produce
   the number of operands in it.
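
To make the intended semantics of items 1 to 4 concrete, here is a minimal Python model of a single DAG node. It is only a sketch under stated assumptions, not TableGen code and not an implementation plan: the names `Dag`, `getdag`, `setdag`, `size`, and `UNSET` are hypothetical, only integer indices are modeled (the `$name` and string forms are left out), and rejecting an out-of-range index in `setdag` is an assumption, since the proposal does not say what should happen there.

```python
from dataclasses import dataclass
from typing import Any, Tuple

UNSET = None  # stands in for TableGen's `?` (uninitialized)


@dataclass(frozen=True)
class Dag:
    """Toy model of a single DAG node: an operator plus positional operands."""
    operator: Any
    operands: Tuple[Any, ...] = ()


def getdag(dag: Dag, index: int, default: Any = UNSET) -> Any:
    """Model of value(index) and !getdag: 0 is the operator, 1..n an operand."""
    if index == 0:
        return dag.operator
    if 1 <= index <= len(dag.operands):
        return dag.operands[index - 1]
    return default  # the item does not exist


def setdag(dag: Dag, *pairs: Any) -> Dag:
    """Model of !setdag: copy the top-level node, then apply index/value pairs."""
    if len(pairs) % 2 != 0:
        raise ValueError("expected index/value pairs")
    operator, operands = dag.operator, list(dag.operands)
    for index, value in zip(pairs[0::2], pairs[1::2]):
        if index == 0:
            operator = value
        elif 1 <= index <= len(operands):
            operands[index - 1] = value
        else:
            # Assumption: the proposal does not say what an out-of-range
            # index does in !setdag, so this sketch rejects it.
            raise IndexError(f"no operand at index {index}")
    return Dag(operator, tuple(operands))


def size(dag: Dag) -> int:
    """Model of the extended !size: operands only, the operator is not counted."""
    return len(dag.operands)


add = Dag("add", ("a", "b"))
sub = setdag(add, 0, "sub", 2, "c")      # new node; `add` is left untouched
dag_list = [add, sub]
second_of_last = getdag(dag_list[1], 2)  # models DagList[1](2)
```

Because `setdag` copies only the top-level node, any nested DAG used as an operand would be shared between the old and new nodes rather than duplicated.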

Hi Paul,

> This is a proposal to enhance TableGen's ability to analyze and manipulate
> DAGs. Hopefully this will allow more complex DAGs to be built in TableGen.
>
> 1. Add a new value suffix.
>
>    value(index)
>
>    The value must be a DAG. The index specifies the operator or an operand,
>    whose value is produced. The index can be
>
>    0        produce the operator
>    1...n    produce operand by position

This seems reasonable.

>    $name    produce operator/operand by its variable name
>    string   produce operator/operand by a string containing its
>             variable name

I don't like this part, because to me it seems to conflate what
indices and operand names in DAGs are for. We can often think of DAG
operators as functions, and the operands are arguments of the
function. So using a numeric index would return an argument of a given
index. However, the $names are _not_ names of function arguments! They
are a mechanism for tagging DAG nodes that are interesting as part of
pattern matching.

So please don't add this functionality, and definitely don't add it in
this way. If there is a convincing reason for extracting DAG nodes by
name, then it should be done via a different ! accessor that performs
a deep search of the DAG (i.e., it can produce DAG nodes inside
arbitrarily deeply nested children).
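
As an illustration of what such a deep-search accessor might look like, here is a small Python sketch. It is not TableGen and not a proposed implementation: the `Dag` model, the `(value, name)` operand pairs, and the function name `find_by_name` are all hypothetical, and returning every match in pre-order is an assumption made only for this example.

```python
from dataclasses import dataclass
from typing import Any, Iterator, Optional, Tuple


@dataclass(frozen=True)
class Dag:
    """Toy DAG node; each operand is a (value, name) pair, name may be None."""
    operator: str
    operands: Tuple[Tuple[Any, Optional[str]], ...] = ()


def find_by_name(dag: Dag, name: str) -> Iterator[Any]:
    """Yield every operand value tagged `name`, at any nesting depth."""
    for value, tag in dag.operands:
        if tag == name:
            yield value
        if isinstance(value, Dag):
            # Descend into nested nodes, not just the top level.
            yield from find_by_name(value, name)


# Models (outer a:$lhs, (inner b:$rhs, c:$lhs)).
inner = Dag("inner", (("b", "rhs"), ("c", "lhs")))
tree = Dag("outer", (("a", "lhs"), (inner, None)))
lhs_values = list(find_by_name(tree, "lhs"))
```

The search can return more than one value for a single name, which is one reason it behaves differently from positional indexing.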

>    If the item does not exist, ? (uninitialized) is produced.

I think it would be better for this to fail instead (i.e., not fold
and produce an error at final resolution). Especially since you
propose !getdag below.

>    Note that multiple value suffixes are allowed, so, for example,
>    DagList[i](1) would produce the first operand of the i-th dag
>    in the list.
>
> 2. Add the !getdag() bang operator.
>
>    !getdag(dag, index [, default])
>
>    This bang operator produces the same result as the (...) suffix.
>    However, the default value can be specified as the third argument.
>    If it is not specified, ? is used.

As above, I would suggest having to specify ? explicitly.

> 3. Add the !setdag bang operator.
>
>    !setdag(dag, index1, value1, index2, value2, ...)
>
>    This bang operator creates a copy of the top-level dag node and
>    then replaces the operator and/or operands with new values. Each
>    replacement is specified by an index and a value. The new dag
>    is produced.
>
> The two new bang operators replace !getop() and !setop(), which could be
> deprecated.
>
> 4. The !size operator will be extended to accept a DAG and produce
>    the number of operands in it.

Sounds good.

Cheers,
Nicolai

I included the ability to get/set an operand by name because I thought it would be easier to copy+modify an existing DAG by specifying the name of the operand you want to replace rather than having to remember its position. For example, if you want to replace the first source, isn't it easier to specify $src than remember it's the second operand?

Perhaps the people actually coding these DAGs have it all down in their minds by position anyway.

My point is precisely that the $names don't work that way. Your
reasoning would be valid if the $names were function/operator argument
names, like in programming languages where you can pass function
arguments based on their order but also by naming them (e.g.
"functionName(argName=x, otherArgName=y)"). However, this is _not_ how
$names work!

Their most prominent application is for instruction selection pattern
matching, e.g. taken at random from AMDGPU/SOPInstructions.td:

def : GCNPat <
  (i32 (smax i32:$x, (i32 (ineg i32:$x)))),
  (S_ABS_I32 SReg_32:$x)
>;

The $x is _not_ the name of the argument to smax, ineg, or S_ABS_I32.
For example, if you look at how S_ABS_I32 is defined, you'll see that
its input operand is called $src0.

Instead, the name allows us to tie three locations in the DAG together
for purposes of pattern matching. The name is only meaningful in the
context of this pattern. You could substitute $x by $y or $whatever
without changing the meaning of the DAG.
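
The point that a $name is a pattern-local tag can be sketched with a toy matcher in Python. This is only an illustration under simplifying assumptions, not how the instruction selector is implemented: types such as i32 and SReg_32 are dropped, trees are plain tuples, and the helper names `match`, `substitute`, and `rewrite` are hypothetical.

```python
def match(pattern, tree, bindings=None):
    """Match `tree` against `pattern`; return the name bindings or None."""
    bindings = {} if bindings is None else bindings
    if isinstance(pattern, str):  # a $name leaf
        if pattern in bindings:
            # A repeated tag must refer to the same value everywhere.
            return bindings if bindings[pattern] == tree else None
        bindings[pattern] = tree
        return bindings
    op, *kids = pattern
    if not isinstance(tree, tuple) or len(tree) != len(pattern) or tree[0] != op:
        return None
    for sub_pattern, sub_tree in zip(kids, tree[1:]):
        if match(sub_pattern, sub_tree, bindings) is None:
            return None
    return bindings


def substitute(template, bindings):
    """Build the result DAG by replacing each $name with its bound value."""
    if isinstance(template, str):
        return bindings[template]
    op, *kids = template
    return (op, *(substitute(kid, bindings) for kid in kids))


def rewrite(src, dst, tree):
    """Apply one source/result pattern pair to `tree`, or return None."""
    bindings = match(src, tree)
    return None if bindings is None else substitute(dst, bindings)


# The pattern above with $x, and the same pattern with $x renamed to $y.
SRC_X, DST_X = ("smax", "$x", ("ineg", "$x")), ("S_ABS_I32", "$x")
SRC_Y, DST_Y = ("smax", "$y", ("ineg", "$y")), ("S_ABS_I32", "$y")

tree = ("smax", "v1", ("ineg", "v1"))
result_x = rewrite(SRC_X, DST_X, tree)
result_y = rewrite(SRC_Y, DST_Y, tree)
```

Both pattern pairs produce the same result for the same input, because the name only links positions within one pattern; it says nothing about how S_ABS_I32 names its own operands.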

That the name is the name of an operator argument is an understandable
misunderstanding, but it _is_ a misunderstanding. If you were to add
that particular feature, you would encourage this misunderstanding
even more.

Cheers,
Nicolai

I understood that the name is a matching tag for the operand and not its name (as in named macro or function arguments). However, I was assuming that the names in any one DAG node had to be unique and so could serve as selectors for operands. But a quick investigation shows that I was wrong: names can be duplicated in the same node.

So DAG indexes are integers only.
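
A short Python sketch of why a name cannot act as a selector once duplicates are allowed. The operand list and the helper `positions_of` are hypothetical and exist only for this illustration.

```python
from typing import Any, List, Optional, Tuple

# One DAG node's operands as (value, name) pairs; name is None when untagged.
Operand = Tuple[Any, Optional[str]]


def positions_of(operands: List[Operand], name: str) -> List[int]:
    """Return the 1-based positions of every operand tagged `name`."""
    return [pos for pos, (_, tag) in enumerate(operands, start=1) if tag == name]


# Models a node such as (op a:$x, b:$y, c:$x), where $x appears twice.
operands = [("a", "x"), ("b", "y"), ("c", "x")]
x_positions = positions_of(operands, "x")
y_positions = positions_of(operands, "y")
```

An integer index always selects exactly one operand, while `$x` here selects two.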

What do you guys think about the enhancements below?

5. !getdagrestype(dag [, index]) - Returns the type of the result value. If the DAG computes multiple values, returns the type of the 'index'th result.

6. !setdagrestype(dag target_dag, type T [, index]) - Sets the return type of target_dag to T. Use of 'index' is as in 5. (Coupled with the existing (or enhanced?) foreach construct, we can construct multiple DAGs with different return types.)

7. !setdagchild(dag target_dag, dag new_dag, index) - Sets child number 'index' of target_dag to new_dag. I think this is more or less similar to the 3 you suggested, but I feel it is more convenient and concise.

8. !setdagchildcond(dag target_dag, dag new_dag, index, {C++ code}) - Similar to 7 above, but does it only if the C++ code returns true. This is useful for checking whether the result type of new_dag and the operand type of the 'index' child of target_dag are compatible. Users can define compatibility using C++ code. For example, it is okay to set the dag even if there is a mismatch between the signedness of the types.

I'm not sure what you mean by "type of result value." Can you explain further?

All of these sound like operations that are specific to TableGen
backend interpretations of what a DAG means. This discussion is about
!ops which are a part of the TableGen frontend, so I don't think any
of these apply here.

Cheers,
Nicolai

Nicolai:

If we have two operators to get and set DAG operator/operands, does it make sense to add two more operators to get/set the $names of operands? They would still specify the operand by integer index.

> All of these sound like operations that are specific to TableGen
> backend interpretations of what a DAG means. This discussion is about
> !ops which are a part of the TableGen frontend, so I don't think any
> of these apply here.

I am not sure why you say so. Aren't 7 and 8 above somewhat similar to !con, which the language already offers? !con allows you to concatenate two DAGs; 7 allows you to connect two DAGs to form a bigger DAG. You may be able to achieve the same today with the existing language constructs, but I don't see a concise way to do this.

5 and 6 are subject to debate, but since it's a language, an addition like this could be useful.

Madhur, could you describe !getdagrestype in more detail? In particular, I don't know what "type of result value" means.

I mean the data type of the result computed by a DAG node. A DAG node may receive a bunch of values from other nodes, apply its operator, and return an 'int'; its restype is then 'int', for example.

Does that make sense?

I don't think it makes sense. The interpretation of DAG operators and operands is entirely up to the backend(s) analyzing those records. The operators have no meaning to TableGen.

Oh yeah, I got that confused with types provided by TableGen. I take that back.

>> What do you guys think about the below enhancements?
>>
>> 5. !getdagrestype(dag [, index]) - Returns type of result value. If the DAG computes multiple values then return type of 'index'th result.
>>
>> 6. !setdagrestype(dag target_dag, type T [, index]) - Set return type of target_dag to T. Use of 'index' is as in 5.(Coupled with the existing (or enhanced?) foreach construct we can construct multiple DAGs with different return types.)
>>
>> 7. !setdagchild(dag target_dag, dag new_dag, index) - Set child 'index' numbered of target_dag to new_dag. I think this is more or less similar to 3 you suggested but I feel it is more convenient and concise.
>>
>> 8. !setdagchildcond(dag target_dag, dag new_dag, index, {C++ code}) - Similar to 7 above but do it only if the C++ code returns true. This is useful to check if the result type of `new_dag` and that of the operand type of 'index' child of 'target_dag' are compatible. Users can define compatibility using C++ code. For example, it is okay to set dag even if there is mismatch between signedness of types.
>
>All of these sound like operations that are specific to TableGen
>backend interpretations of what a DAG means. This discussion is about
>!ops which are a part of the TableGen frontend, so I don't think any
>of these apply here.

> I am not sure why you say so. Aren't 7 and 8 above somewhat similar to !con, which the language already offers? !con allows you to concatenate two DAGs; 7 allows you to connect two DAGs to form a bigger DAG. You may be able to achieve the same today with the existing language constructs, but I don't see a concise way to do this.

You're right that 7 is like !con (and actually seems like it's
identical to the !setdag proposed by Paul).

8 doesn't work though because you can't just run C++ code at TableGen
frontend runtime.

Cheers,
Nicolai

> Nicolai:
>
> If we have two operators to get and set DAG operator/operands, does it make sense to add two more operators to get/set the $names of operands? They would still specify the operand by integer index.

Yes, I do think that would be helpful. For setting, I think you'd want
to set the child and name at the same time, i.e. something like
!setdag(dag_to_modify, index1, value1, name1, index2, value2, name2,
...). Either valueN or nameN can be `?` (unset).

Cheers,
Nicolai
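
The index/value/name form of !setdag suggested above could behave roughly like the following Python sketch. Everything here is hypothetical (`Dag`, `setdag`, `UNSET`), and one point is an explicit assumption: passing `?` (modeled as `UNSET`) for a value or a name is taken to mean "leave that part unchanged", which the thread does not spell out.

```python
from dataclasses import dataclass
from typing import Any, Optional, Tuple

UNSET = object()  # stands in for `?` in an argument position


@dataclass(frozen=True)
class Dag:
    """items[0] is the operator, items[1:] the operands; each is (value, name)."""
    items: Tuple[Tuple[Any, Optional[str]], ...]


def setdag(dag: Dag, *triples: Any) -> Dag:
    """Copy the top-level node, then apply index/value/name triples."""
    if len(triples) % 3 != 0:
        raise ValueError("expected index/value/name triples")
    items = list(dag.items)  # shallow copy: nested DAGs are shared
    for pos in range(0, len(triples), 3):
        index, value, name = triples[pos:pos + 3]
        if not 0 <= index < len(items):
            raise IndexError(f"no item at index {index}")
        old_value, old_name = items[index]
        # Assumption: UNSET keeps the existing value or name.
        items[index] = (
            old_value if value is UNSET else value,
            old_name if name is UNSET else name,
        )
    return Dag(tuple(items))


# Models (add a:$lhs, b:$rhs).
node = Dag((("add", None), ("a", "lhs"), ("b", "rhs")))
# Replace operand 1's value only, and operand 2's name only.
updated = setdag(node, 1, "c", UNSET, 2, UNSET, "src")
```

Setting the value and the name in one call keeps the two from drifting apart, at the cost of a longer argument list.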

Beautiful.

Should there be !getdagvalue and !getdagname, or should !getdag produce a pair?

I don't have a strong opinion, but would lean slightly towards
!getdagvalue / !getdagname because of how things tend to fit together
in TableGen in general.

Cheers,
Nicolai