adding switches to llvm-ld to disable certain optimizations.

Hi,
I need to add switches such as -disable-mem2reg and -disable-gvn to llvm-ld. Currently, CreateStandardLTOPasses takes only the DisableInternalize and DisableInliner switches.

Is it okay to modify this API for these new switches, or can this be done some other way?

- Sanjiv

Why do you want this?

-Chris

I don't think this is appropriate for the standard passes API. If you
really care about tweaking the pass list, you should be making your
own (based on the standard pass list perhaps).

- Daniel

Have you ever investigated the following approach? Define fake
register+register forms of common instructions, in addition to the
register+memory forms. Let the instruction selector work as if
everything were in registers. Then, since there's only one physical
register, the register allocator will have to spill, and the spills
and reloads can be folded in, eliminating the fake register+register
forms. You might need special handling for the case where both
operands are the same.

If this works well enough, it would allow your target to be less
strange from LLVM's perspective. Fewer things would need to be
Custom-expanded (e.g. ADD), and it may even allow you to actually
run more of the optimizer (since without mem2reg, much of the
optimizer is effectively disabled).

Dan
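To make the folding idea above concrete, here is a minimal toy sketch in C++. It is not LLVM code: the opcode names, the Instr struct and foldReload are all hypothetical, and the model assumes a single-accumulator machine where each instruction has one extra operand. It only shows the rewrite step, where a fake register+register form whose operand was spilled is replaced by the real register+memory form reading the stack slot. The case where both operands are the same is not modeled.

```cpp
#include <cassert>
#include <optional>

// Hypothetical opcodes for a single-accumulator target. The "rr" forms are
// the fake register+register instructions; the "rm" forms are the real ones.
enum class Opcode { ADDrr, ADDrm, SUBrr, SUBrm };

// Toy instruction: acc = acc OP operand, where the operand is either a
// virtual register (fake form) or a stack slot (real form).
struct Instr {
    Opcode op;
    int vreg = -1;  // virtual register operand, -1 if unused
    int slot = -1;  // stack slot operand, -1 if unused
};

// Map a fake register+register opcode to its register+memory counterpart.
std::optional<Opcode> memoryForm(Opcode op) {
    switch (op) {
    case Opcode::ADDrr: return Opcode::ADDrm;
    case Opcode::SUBrr: return Opcode::SUBrm;
    default:            return std::nullopt;  // already a memory form
    }
}

// If `user` reads the spilled virtual register, return the folded
// instruction that reads the stack slot directly; otherwise return nothing.
std::optional<Instr> foldReload(const Instr &user, int spilledVReg, int slot) {
    assert(slot >= 0 && "a spill must have been assigned a stack slot");
    if (user.vreg != spilledVReg)
        return std::nullopt;
    std::optional<Opcode> memOp = memoryForm(user.op);
    if (!memOp)
        return std::nullopt;
    return Instr{*memOp, -1, slot};
}
```

In this model, folding a reload of virtual register 5 from slot 2 into an ADDrr that uses register 5 yields an ADDrm on slot 2, so no separate reload instruction is needed.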

I remember that you suggested this in one of your earlier emails as well, which I lost, and I had been searching desperately for it. Glad that you brought it up again.
The approach actually sounds better, as it would drastically simplify the back-end code. But I was clueless as to how to make the register allocator fold the spills and reloads into the actual target instructions. The only interfaces it exposes are storeRegToStackSlot and loadRegFromStackSlot, and we don't even know for which instructions these spills and reloads are happening. All these APIs get is a frame index.
Now that you have decided to get us to explore a better path, it would be good if you could shed more light on these issues.

One more thing that I feel would greatly simplify things is to make i16 legal (which would make the pointer type legal) and from there on lower the types and operations to 8-bit ourselves (as the type legalizer wouldn't do that). Doing that, we would pretty much need to duplicate the legalizer code in our back-end, as the type legalizer interfaces are currently not exposed to TargetLowering. Or can a back-end just create an instance of the type legalizer and use it?

Thanks,
- Sanjiv
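As an illustration of what lowering a 16-bit operation to 8-bit pieces amounts to, here is a small standalone C++ sketch. It is not legalizer code and uses no LLVM API; the names Pair8, split16, join16, add16via8 and checkAgainstNative are made up for this example. It shows a 16-bit add expressed as an 8-bit add of the low bytes followed by an 8-bit add-with-carry of the high bytes.

```cpp
#include <cassert>
#include <cstdint>

// A 16-bit value held as two 8-bit halves (hypothetical helper type).
struct Pair8 {
    std::uint8_t lo;
    std::uint8_t hi;
};

Pair8 split16(std::uint16_t v) {
    return {static_cast<std::uint8_t>(v & 0xFF),
            static_cast<std::uint8_t>(v >> 8)};
}

std::uint16_t join16(Pair8 p) {
    return static_cast<std::uint16_t>(p.lo | (p.hi << 8));
}

// 16-bit add built from two 8-bit adds: add the low bytes, then add the
// high bytes plus the carry out of the low-byte addition.
Pair8 add16via8(Pair8 a, Pair8 b) {
    unsigned lowSum = static_cast<unsigned>(a.lo) + b.lo;
    std::uint8_t carry = lowSum > 0xFF ? 1 : 0;
    std::uint8_t lo = static_cast<std::uint8_t>(lowSum);
    std::uint8_t hi = static_cast<std::uint8_t>(a.hi + b.hi + carry);
    return {lo, hi};
}

// Sanity check against native 16-bit wrap-around addition.
void checkAgainstNative(std::uint16_t a, std::uint16_t b) {
    std::uint16_t expected = static_cast<std::uint16_t>(a + b);
    assert(join16(add16via8(split16(a), split16(b))) == expected);
}
```

Every i16 operation would need a similar expansion, which is why doing this in the back-end rather than in the type legalizer means duplicating a good deal of logic.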

The main API hook here is TargetInstrInfo::foldMemoryOperandImpl; there's
a FrameIndex form and a generic load form.

To be sure, I don't know if this kind of approach will work well. But if it
does, it could help make PIC16 less different from other targets in LLVM.

One more thing that I feel would greatly simplify things is to make i16 legal (which would make the pointer type legal) and from there on lower the types and operations to 8-bit ourselves (as the type legalizer wouldn't do that). Doing that, we would pretty much need to duplicate the legalizer code in our back-end, as the type legalizer interfaces are currently not exposed to TargetLowering. Or can a back-end just create an instance of the type legalizer and use it?

I don't have anything to suggest here.

Dan

Rather than adding flags, why not add your own custom routine that populates the pass manager? You don't have to use CreateStandardLTOPasses, nor do you have to use the standard llvm-ld.

Evan
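A rough sketch of what such a custom routine could look like, written as standalone C++ rather than against the LLVM headers. PassOptions and buildCustomLTOPassList are hypothetical names, and the pass list and its order are purely illustrative, not the actual standard LTO pipeline. In a real tool each entry would correspond to adding a pass to the pass manager instead of pushing a string.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical option set for a tool that builds its own pass list.
struct PassOptions {
    bool disableInternalize = false;
    bool disableInliner = false;
    bool disableMem2Reg = false;
    bool disableGVN = false;
};

// Build an illustrative pass list, skipping whatever the caller disabled.
// The strings stand in for the real pass-creation calls.
std::vector<std::string> buildCustomLTOPassList(const PassOptions &opts) {
    std::vector<std::string> passes;
    if (!opts.disableInternalize) passes.push_back("internalize");
    if (!opts.disableInliner)     passes.push_back("inline");
    if (!opts.disableMem2Reg)     passes.push_back("mem2reg");
    if (!opts.disableGVN)         passes.push_back("gvn");
    passes.push_back("globaldce");  // cleanup pass, always scheduled here
    assert(!passes.empty() && "the cleanup pass is always present");
    return passes;
}
```

The point of the design is that the extra switches live in the tool that owns the list, so the shared standard-passes API does not grow a flag per optimization.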

One more thing that I feel would greatly simplify things is to make i16 legal (which would make the pointer type legal) and from there on lower the types and operations to 8-bit ourselves (as the type legalizer wouldn't do that). Doing that, we would pretty much need to duplicate the legalizer code in our back-end, as the type legalizer interfaces are currently not exposed to TargetLowering. Or can a back-end just create an instance of the type legalizer and use it?

I don’t have anything to suggest here.

Dan

Duncan,
Your two cents are needed here.

- Sanjiv

I don't think our problem lies in the way we define our instructions,
nor will it be resolved by removing the Mem2Reg optimization. As Dan
says, Mem2Reg is the prerequisite for so many other optimizations that
we can't afford to lose it; in fact, removing Mem2Reg helps in some
cases, but in a few cases it even increases the code size.
I think the answer is in the scheduler. Currently the LLVM scheduler
tries to reduce register pressure over the aggregate of operations in
one basic block and leaves the rest to the register allocator to do its
magic (at least that is how I understand it); however, for an 8-bit
device with only one register, there isn't much that the register
allocator can do, so the number of spills increases.
What I think we should do is add a new scheduling mode in which the
scheduler tries to keep all operations on one dataflow path together,
much like what one would do for a stack-based machine.
This stack-based scheduler mode is what I've been thinking of adding,
but I need more clues about how to do it and what it would affect in
other pieces of LLVM. Any input in this regard is appreciated.

Thanks
A.
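To illustrate the kind of ordering meant by keeping one dataflow path together, here is a toy C++ sketch for a one-accumulator machine. It is not LLVM's scheduler and works on a plain expression tree rather than a DAG; Node, leaf, binary and emit, as well as the "load", "store" and temporary names, are all hypothetical. Each operation chain is emitted contiguously, and a temporary is stored only when the right operand is itself a computed value.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy expression tree: a leaf names a memory variable, an interior node
// applies a binary operation to its two children.
struct Node {
    std::string op;    // empty for a leaf
    std::string name;  // variable name, used only by leaves
    std::unique_ptr<Node> lhs, rhs;
};

std::unique_ptr<Node> leaf(std::string name) {
    auto n = std::make_unique<Node>();
    n->name = std::move(name);
    return n;
}

std::unique_ptr<Node> binary(std::string op, std::unique_ptr<Node> lhs,
                             std::unique_ptr<Node> rhs) {
    assert(lhs && rhs && "a binary node needs both operands");
    auto n = std::make_unique<Node>();
    n->op = std::move(op);
    n->lhs = std::move(lhs);
    n->rhs = std::move(rhs);
    return n;
}

// Emit code that leaves the value of `n` in the single accumulator.
// A leaf right operand is used directly from memory; a computed right
// operand is evaluated first and parked in a fresh temporary.
void emit(const Node &n, std::vector<std::string> &out, int &nextTemp) {
    if (n.op.empty()) {
        out.push_back("load " + n.name);
        return;
    }
    if (n.rhs->op.empty()) {
        emit(*n.lhs, out, nextTemp);
        out.push_back(n.op + " " + n.rhs->name);
        return;
    }
    emit(*n.rhs, out, nextTemp);
    std::string temp = "t" + std::to_string(nextTemp++);
    out.push_back("store " + temp);
    emit(*n.lhs, out, nextTemp);
    out.push_back(n.op + " " + temp);
}
```

In this model, (a + b) - c needs no temporary at all, while a - (b + c) needs exactly one store, which gives a feel for how the emission order alone drives the spill count on such a machine.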
