choice between SSAPRE and bitvector aporach

Hi LLVMers,
    I am a PHD student in CS dept in UIUC, I am doing a project for
Vikram's course, it is about PRE. I would like to know why you didn't
choose SSAPRE in LLVM, since it seems to be more suitable for LLVM (it
can operate directly on SSA form and avoid the conversion between SSA
and bit-vector). Can anyone tell me the reason?
Xuehai

Hi LLVMers,
   I am a PHD student in CS dept in UIUC, I am doing a project for
Vikram's course, it is about PRE. I would like to know why you didn't
choose SSAPRE in LLVM, since it seems to be more suitable for LLVM (it
can operate directly on SSA form and avoid the conversion between SSA
and bit-vector). Can anyone tell me the reason?

Hi Xuehai,

If I remember correctly, there were several details that the paper assumed that made adapting it to work in LLVM very difficult.

-bw

GVNPRE (which LLVM uses, see lib/Scalar/GVNPRE.cpp) is value based,
not lexical identity based like SSAPRE. As such it has different
requirements.
In addition, it operates on SSA form as much as SSAPRE does

SSAPRE builds an expression PRE form, known as EPRE, sparsely, for
each variable.
This includes EPHI placement and ESSA renaming.

GVNPRE does not need to build an expression SSA form to do it's work
(It lets value numbering take care of determining the necessary
qualities)

Both take advantage of SSA, neither is really directly working on SSA form.

There is no conversion between bit vector and SSA in either one.
In the case of SSAPRE, there is a conversion from EPRE back into regular SSA.
In the case of GVNPRE, there is no conversion.

It is possible to make a sparse version of GVNPRE (IE compute it
per-variable like SSAPRE does). I have one completed.
It is neither faster nor more memory efficient than the bitvector approach.

I strongly suggest that you reread the GVNPRE and SSAPRE papers.
The question you are asking implies you believe somehow SSAPRE is a
"true ssa" analysis and GVNPRE is not.
The reality is they both require work on top of SSA.

i"m not sure why you would think it is more suitable.

> Hi LLVMers,
> I am a PHD student in CS dept in UIUC, I am doing a project for
> Vikram's course, it is about PRE. I would like to know why you didn't
> choose SSAPRE in LLVM, since it seems to be more suitable for LLVM (it
> can operate directly on SSA form and avoid the conversion between SSA
> and bit-vector). Can anyone tell me the reason?

Hi Xuehai,
>

If I remember correctly, there were several details that the paper
assumed that made adapting it to work in LLVM very difficult.

It would at least require side-data structures to store the
occurrences and expression phis.

Dan,

Doesn't the paper also assume the invariant that phi operands are effectively dead after the Phi, which is true right after SSA is constructed, but potentially not after transformations?

--Vikram

Yes, I think that that was the major problem with it.

-bw

I do not recall this being *required*. It certainly will miss
optimizations if you have a pruned form (there are still testcases in
gcc bugzilla i believe), but i do not believe it will generate
incorrect code.
I could be misremembering.
The main annoying invariant i recall is that it requires SSA live
ranges of original program variables do not overlap in order for ESSA
renaming to come out correct. Even simple optimizations on renaming
forms will generate destroy this invariant.
It certainly can be made true as a pre-pass through further renaming.
It is also trivially true of all non-renaming FUD chain variants of SSA

--Dan

Dan,

Doesn't the paper also assume the invariant that phi operands are
effectively dead after the Phi, which is true right after SSA is
constructed, but potentially not after transformations?

I do not recall this being *required*. It certainly will miss
optimizations if you have a pruned form (there are still testcases in
gcc bugzilla i believe), but i do not believe it will generate
incorrect code.
I could be misremembering.
The main annoying invariant i recall is that it requires SSA live
ranges of original program variables do not overlap in order for ESSA
renaming to come out correct. Even simple optimizations on renaming
forms will generate destroy this invariant.

Actually, what I was describing was equivalent to this.

It certainly can be made true as a pre-pass through further renaming.

Yes, but that would just undo the benefits of some of the optimizations that can make such live ranges overlap, e.g., GVN. So it seems undesirable to do this.

It is also trivially true of all non-renaming FUD chain variants of SSA

I'm not sure what you mean by 'FUD' here. In any case, I think the problem is non-trivial.

--Vikram
http://www.cs.uiuc.edu/~vadve
http://llvm.org/

>>
>>

>> Dan,
>>
>> Doesn't the paper also assume the invariant that phi operands are
>> effectively dead after the Phi, which is true right after SSA is
>> constructed, but potentially not after transformations?
> I do not recall this being *required*. It certainly will miss
> optimizations if you have a pruned form (there are still testcases in
> gcc bugzilla i believe), but i do not believe it will generate
> incorrect code.
> I could be misremembering.
> The main annoying invariant i recall is that it requires SSA live
> ranges of original program variables do not overlap in order for ESSA
> renaming to come out correct. Even simple optimizations on renaming
> forms will generate destroy this invariant.

Actually, what I was describing was equivalent to this.
> It certainly can be made true as a pre-pass through further renaming.

Yes, but that would just undo the benefits of some of the
optimizations that can make such live ranges overlap, e.g., GVN. So
it seems undesirable to do this.

yes

> It is also trivially true of all non-renaming FUD chain variants of
> SSA

I'm not sure what you mean by 'FUD' here. In any case, I think the
problem is non-trivial.

Factored Use-Def.

See, e.g., Extended SSA with factored use-def chains to support optimization and parallelism | IEEE Conference Publication | IEEE Xplore

Because of how they work (phi nodes but no renaming of variables) they
do not allow overlapping live ranges.
Most also avoid the need to do out-of-ssa, they can just drop their
chains and phi nodes.