Complex load patterns and token factors

Working on a target I added this pattern:

def : Pat<(v4i64 (load xoaddr:$src)),
          (QVFCTIDb (QVLFDXb xoaddr:$src))>;

which represents an actual load followed by a necessary conversion
operation. The problem is that when this matches any TokenFactor that
was attached to the load node gets attached, not to the inner load
instruction, but the outer conversion operation. This is causing
miscompiles because stores that should be scheduled before the load end
up scheduled after the load (but before the conversion operation,
because the conversion operation inherited the TokenFactor input).

I'd like to fix this so that it works correctly: the TokenFactor
inputs should be attached to all inner-most instructions. I'm guessing
this is somewhere in SelectionDAGISel.cpp, but if someone has a
more-specific idea, I'd appreciate hearing it.

Thanks in advance,
Hal

I believe it's a current TableGen limitation. When generating its DAG
tables for this kind of thing, TableGen gives output instructions that
should take chain a special flag: Opfl_Chain.

Unfortunately the way it decides which instructions are worthy of this
flag is rather naive:
  + If an instruction has a built-in pattern (in the Instruciton
    record), it checks whether that makes use of a chain.
  + Otherwise, the outer instruction of the Pat gets a chain.

So if your QVLFDXb instruction doesn't set the Pattern(s?) field,
TableGen won't know it needs the chain.

There's a big comment just above the test about how the current
situation is rather broken. I'm currently inclined to add a check for
mayLoad and mayStore at that point in TableGeni (see patch), but was
waiting until I could give tests and justification on the list before
submitting it.

Tim.

tblgen-chains.diff (783 Bytes)

> Working on a target I added this pattern:
>
> def : Pat<(v4i64 (load xoaddr:$src)),
> (QVFCTIDb (QVLFDXb xoaddr:$src))>;
>
> I'd like to fix this so that it works correctly: the TokenFactor
> inputs should be attached to all inner-most instructions. I'm
> guessing this is somewhere in SelectionDAGISel.cpp, but if someone
> has a more-specific idea, I'd appreciate hearing it.

I believe it's a current TableGen limitation. When generating its DAG
tables for this kind of thing, TableGen gives output instructions that
should take chain a special flag: Opfl_Chain.

Unfortunately the way it decides which instructions are worthy of this
flag is rather naive:
  + If an instruction has a built-in pattern (in the Instruciton
    record), it checks whether that makes use of a chain.
  + Otherwise, the outer instruction of the Pat gets a chain.

So if your QVLFDXb instruction doesn't set the Pattern(s?) field,
TableGen won't know it needs the chain.

Tim,

Correct, the instruction has no pattern of its own.

There's a big comment just above the test about how the current
situation is rather broken. I'm currently inclined to add a check for
mayLoad and mayStore at that point in TableGeni (see patch), but was
waiting until I could give tests and justification on the list before
submitting it.

Thanks for sending this! As far as a long-term solution, would it be
better to update TableGen with this logic instead of putting this in
ISel? Also, we should probably include hasUnmodeledSideEffects along
with mayLoad and mayStore.

-Hal

> > Working on a target I added this pattern:
> >
> > def : Pat<(v4i64 (load xoaddr:$src)),
> > (QVFCTIDb (QVLFDXb xoaddr:$src))>;
> >
> > I'd like to fix this so that it works correctly: the TokenFactor
> > inputs should be attached to all inner-most instructions. I'm
> > guessing this is somewhere in SelectionDAGISel.cpp, but if someone
> > has a more-specific idea, I'd appreciate hearing it.
>
> I believe it's a current TableGen limitation. When generating its
> DAG tables for this kind of thing, TableGen gives output
> instructions that should take chain a special flag: Opfl_Chain.
>
> Unfortunately the way it decides which instructions are worthy of
> this flag is rather naive:
> + If an instruction has a built-in pattern (in the Instruciton
> record), it checks whether that makes use of a chain.
> + Otherwise, the outer instruction of the Pat gets a chain.
>
> So if your QVLFDXb instruction doesn't set the Pattern(s?) field,
> TableGen won't know it needs the chain.

Tim,

Correct, the instruction has no pattern of its own.

>
> There's a big comment just above the test about how the current
> situation is rather broken. I'm currently inclined to add a check
> for mayLoad and mayStore at that point in TableGeni (see patch),
> but was waiting until I could give tests and justification on the
> list before submitting it.

Thanks for sending this!

Please ignore this statement:

As far as a long-term solution, would it be
better to update TableGen with this logic instead of putting this in
ISel?

-Hal

>
> > > Working on a target I added this pattern:
> > >
> > > def : Pat<(v4i64 (load xoaddr:$src)),
> > > (QVFCTIDb (QVLFDXb xoaddr:$src))>;
> > >
> > > I'd like to fix this so that it works correctly: the TokenFactor
> > > inputs should be attached to all inner-most instructions. I'm
> > > guessing this is somewhere in SelectionDAGISel.cpp, but if
> > > someone has a more-specific idea, I'd appreciate hearing it.
> >
> > I believe it's a current TableGen limitation. When generating its
> > DAG tables for this kind of thing, TableGen gives output
> > instructions that should take chain a special flag: Opfl_Chain.
> >
> > Unfortunately the way it decides which instructions are worthy of
> > this flag is rather naive:
> > + If an instruction has a built-in pattern (in the Instruciton
> > record), it checks whether that makes use of a chain.
> > + Otherwise, the outer instruction of the Pat gets a chain.
> >
> > So if your QVLFDXb instruction doesn't set the Pattern(s?) field,
> > TableGen won't know it needs the chain.
>
> Tim,
>
> Correct, the instruction has no pattern of its own.
>
> >
> > There's a big comment just above the test about how the current
> > situation is rather broken. I'm currently inclined to add a check
> > for mayLoad and mayStore at that point in TableGeni (see patch),
> > but was waiting until I could give tests and justification on the
> > list before submitting it.
>
> Thanks for sending this!

Please ignore this statement:
> As far as a long-term solution, would it be
> better to update TableGen with this logic instead of putting this in
> ISel?

-Hal

> Also, we should probably include hasUnmodeledSideEffects along
> with mayLoad and mayStore.

I also had to include II.canFoldAsLoad to make this work for me. As is
the case with other "simple" loads in the PowerPC backend,
canFoldAsLoad is set but mayLoad is not (is this wrong)?

Thanks again,
Hal

I also had to include II.canFoldAsLoad to make this work for me. As is
the case with other "simple" loads in the PowerPC backend,
canFoldAsLoad is set but mayLoad is not (is this wrong)?

Hmm. So far we've got: mayLoad, mayStore, canFoldAsLoad and
hasUnmodeledSideEffects as candidates.

Looking at Target.td, I see that I missed hasCtrlDep which seems to be
exactly what we're looking for, though it doesn't appear to actually
be used for *anything* at the moment. I'm not sure how I missed it
before.

Unless anyone comes up with a better idea (or just a reason why this
idea is bad) I think I'll send a patch making the predicate just
hasCtrlDep to llvm-commits later, and change my instructions so they
set that flag when needed. It's less friendly than checking all of the
others, but what if someone has a mayLoad instruction they don't want
to chain (for some bizarre reason)?

Tim.

> I also had to include II.canFoldAsLoad to make this work for me. As
> is the case with other "simple" loads in the PowerPC backend,
> canFoldAsLoad is set but mayLoad is not (is this wrong)?

Hmm. So far we've got: mayLoad, mayStore, canFoldAsLoad and
hasUnmodeledSideEffects as candidates.

Looking at Target.td, I see that I missed hasCtrlDep which seems to be
exactly what we're looking for, though it doesn't appear to actually
be used for *anything* at the moment. I'm not sure how I missed it
before.

Unless anyone comes up with a better idea (or just a reason why this
idea is bad) I think I'll send a patch making the predicate just
hasCtrlDep to llvm-commits later, and change my instructions so they
set that flag when needed. It's less friendly than checking all of the
others, but what if someone has a mayLoad instruction they don't want
to chain (for some bizarre reason)?

If it is not being used, maybe we should change it to be
doesNotNeedChain? I am not sure what kind of load does not need to be
chained (prefetch?), but I'd prefer to keep the system friendly if at
all possible. They'll be fewer bugs that way.

Thanks again,
Hal

If it is not being used, maybe we should change it to be
doesNotNeedChain?

That sounded like a really good idea until I started to actually go
through removing all traces of hasCtrlDep from other Targets. I felt
really bad about wiping those flags people had carefully put on their
instructions, even if useless.

I am not sure what kind of load does not need to be chained (prefetch?), but I'd prefer to keep the system friendly if at
all possible. They'll be fewer bugs that way.

That is the main worry. Could we justify a second "hasNoCtrlDep" in
addition to "hasCtrlDep", with it being a TableGen error to set both?
Or just forget about the override until someone actually comes up with
an instruction that's legitimately "mayLoad" or whatever yet doesn't
want a chain (I don't think even prefetch counts -- if other
instructions are modifying that location it needs to know).

Thanks again,

No worries.

Tim.

> If it is not being used, maybe we should change it to be
> doesNotNeedChain?

That sounded like a really good idea until I started to actually go
through removing all traces of hasCtrlDep from other Targets. I felt
really bad about wiping those flags people had carefully put on their
instructions, even if useless.

> I am not sure what kind of load does not need to be chained
> (prefetch?), but I'd prefer to keep the system friendly if at all
> possible. They'll be fewer bugs that way.

That is the main worry. Could we justify a second "hasNoCtrlDep" in
addition to "hasCtrlDep", with it being a TableGen error to set both?
Or just forget about the override until someone actually comes up with
an instruction that's legitimately "mayLoad" or whatever yet doesn't
want a chain

This is probably the best approach. There is no need to invent a
feature with no use cases.

-Hal