[RFC] NewGVN

Hi,
we would like to propose a new Global Value Numbering pass in LLVM.
The ideas/code are from Daniel Berlin (with a minor overhaul/splitting
into submittable patches from me). The code has been around for a
while (2012 or before), and we think it's getting ready to be
committed upstream.

### Motivation

To put things into context: my personal motivation for having a new
GVN/PRE algorithm is LTO.
It's no secret that LLVM is getting slower release after release, as
Rafael pointed out in March [1] (and as many others have probably
found). I profiled LTO on many internal and open-source applications
(including clang itself) and noticed that GVN always shows up in the
top three passes (and it's generally the pass where we spend most of
the time in the middle-end). There are (extreme) cases where 90% of
the compile time goes into GVN.

Example:

First, thanks. This is a very very long time coming :)
Second, for those watching, note that pretty much all of the improvements and missing cases, including load forwarding/coercion, etc, actually are done.
It’s more a matter of cleaning them up, breaking it down, and submitting it, than “implementing them”.
The main experimentation at this point is “can we do it cleaner” not “can we do it” :)

It’s also important to note that this new GVN treats loads/stores in a unified way with scalars, unlike current GVN (which has no load or store value numbering).

So it will happily discover complex load/store relations (though there are some improvements we can still make here).
For example:

int vnum_test8(int *data)
{
    int i;
    int stop = data[3];
    int m = data[4];
    int n = m;
    int p;
    for (i=0; i<stop; i++) {
        int k = data[2];
        data[k] = 2;
        data[0] = m - n;
        k = data[1];
        m = m + k;
        n = n + k;
        p = data[0];
    }
    return p;
}

LLVM’s current GVN will eliminate a single load here[1]
NewGVN will calculate that m and n are equal, that m-n is 0, that p is 0
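
A minimal sketch for checking that claim by hand. It repeats the function from the example and adds a hypothetical helper, `run_on_copy`, that is not part of the original mail. The assumptions are that `data[3]` is positive (so the loop body runs and `p` is assigned) and that `data[2]` is a valid index into a five-element array.

```c
#include <assert.h>
#include <string.h>

/* The loop from the example above, with unchanged behavior. */
int vnum_test8(int *data)
{
    int i;
    int stop = data[3];
    int m = data[4];
    int n = m;
    int p;
    for (i = 0; i < stop; i++) {
        int k = data[2];
        data[k] = 2;
        data[0] = m - n;   /* m == n on every iteration, so this stores 0 */
        k = data[1];
        m = m + k;         /* m and n are advanced by the same k */
        n = n + k;
        p = data[0];       /* no store between the write above and this load */
    }
    return p;
}

/* Hypothetical helper: run the loop on a private copy of five ints.
 * Callers must pass init[3] > 0 and 0 <= init[2] <= 4. */
int run_on_copy(const int *init)
{
    int data[5];
    memcpy(data, init, sizeof data);
    return vnum_test8(data);
}
```

Under those assumptions, `run_on_copy` should return 0 for any input, which is the value the mail says NewGVN derives for `p`.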

It’s not quite perfect yet, i haven’t fixed store handling, so the following is missed:

int a;
int *p;
// LLVM is too smart and if we don’t do this, realizes *p is a store to undef
void foo(){
    p = &a;
}
int main(int argc, char **argv) {
    int result;
    foo();
    *p = 2;
    if (argc)
        *p = 2;
    result = *p;
    return result;
}

Here, current LLVM GVN will do nothing, because it can’t understand anything really about the stores.
GCC’s GVN will determine result is 2.
NewGVN is not quite that smart yet (it requires a little work on what we do to stores, and on value numbering memory SSA versions).

This issue compounds if you have conditional stores of the same value.

So, for example, if you add:

if (i < 30)
data[0] = 0;

to the first case.

GCC can still determine p is 0.

Currently, NewGVN cannot.
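
For concreteness, here is a sketch of the first example with the conditional store added. The mail does not say where in the loop the `if` goes; placing it just before the final load of `data[0]` is an assumption, as is the name `vnum_test8_cond`.

```c
#include <assert.h>

/* First example plus the conditional store.
 * The position of the `if` inside the loop is assumed, not stated in the mail. */
int vnum_test8_cond(int *data)
{
    int i;
    int stop = data[3];
    int m = data[4];
    int n = m;
    int p;
    for (i = 0; i < stop; i++) {
        int k = data[2];
        data[k] = 2;
        data[0] = m - n;   /* stores 0, since m == n */
        k = data[1];
        m = m + k;
        n = n + k;
        if (i < 30)
            data[0] = 0;   /* conditional store of the same value */
        p = data[0];       /* 0 on either path */
    }
    return p;
}
```

Both paths into the final load leave 0 in `data[0]`, so the function still returns 0 whenever the loop runs at least once. Proving that requires seeing that the two stores write the same value.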

–Dan

> First, thanks. This is a very very long time coming :)
> Second, for those watching, note that pretty much all of the improvements
> and missing cases, including load forwarding/coercion, etc, actually are
> done.
> It's more a matter of cleaning them up, breaking it down, and submitting it,
> than "implementing them".

Oh sure, bad wording. For those interested in the other pieces, this
is Dan's branch
https://github.com/dberlin/llvm-gvn-rewrite

This is really great to see, as I’ve spent far too much of my life over the past two years fighting with undocumented assumptions made by GVN. A couple of quick questions about the new GVN, based on problems I’ve had with the old one:

Does it assume that it’s always safe to widen a load (or store) to a power of two? For our target, this is only sound if you can show that the pointer was used to read all of the bytes that you are loading (we have byte-granularity memory safety). Old GVN has no hooks for targets to specify whether this is safe and so is implicitly assuming a page-based MMU. This optimisation is also unsound for M-profile ARM cores, though will fail occasionally there, whereas it fails deterministically for us.

Does it make any assumptions about the layout of memory in pointers? Old GVN treats pointers as integers and assumes that it’s safe to do partial stores to them. As a prerequisite for memory safety, we must be able to guarantee atomic updates to pointers and we had to hack GVN to disable a bunch of these things. In LLVM IR, pointers are opaque and there is no guarantee that their representation is the same as a same-sized integer, but old GVN makes this assumption.

David

> This is really great to see, as I’ve spent far too much of my life over
> the past two years fighting with undocumented assumptions made by GVN. A
> couple of quick questions about the new GVN, based on problems I’ve had
> with the old one:
>
> Does it assume that it’s always safe to widen a load (or store) to a power
> of two?

I don't believe old gvn does widening any more, and new gvn certainly
doesn't.

> For our target, this is only sound if you can show that the pointer was
> used to read all of the bytes that you are loading (we have
> byte-granularity memory safety). Old GVN has no hooks for targets to
> specify whether this is safe and so is implicitly assuming a page-based
> MMU. This optimisation is also unsound for M-profile ARM cores, though
> will fail occasionally there, whereas it fails deterministically for us.
>
> Does it make any assumptions about the layout of memory in pointers?

Not that i know of

> Old GVN treats pointers as integers and assumes that it’s safe to do
> partial stores to them.

Can you give an example?
Is this store coercion or something?

>> This is really great to see, as I’ve spent far too much of my life over
>> the past two years fighting with undocumented assumptions made by GVN. A
>> couple of quick questions about the new GVN, based on problems I’ve had with
>> the old one:
>>
>> Does it assume that it’s always safe to widen a load (or store) to a power
>> of two?
>
> I don't believe old gvn does widening any more, and new gvn certainly
> doesn't.

Yes, as it apparently blocks other optimizations. David, see
D24096 ("Do not widen load for different variable in GVN") for details.
As an aside, if you're interested in combining loads, you may want to
take a look at Michael Spencer's loadcombine pass.

>> For our target, this is only sound if you can show that the pointer was
>> used to read all of the bytes that you are loading (we have byte-granularity
>> memory safety). Old GVN has no hooks for targets to specify whether this is
>> safe and so is implicitly assuming a page-based MMU. This optimisation is
>> also unsound for M-profile ARM cores, though will fail occasionally there,
>> whereas it fails deterministically for us.
>>
>> Does it make any assumptions about the layout of memory in pointers?

Which assumptions are you thinking of?

My last merge from upstream was about a year ago (and a new one is long overdue), but there were issues where GVN was assuming that if it did a load of a pointer then a ptrtoint, then a truncation, that it would get the same result as doing a narrower load. This is not the case in any platform where pointers are not simply integers (i.e. where you actually need inttoptr / ptrtoint instead of bitcast).

David
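
To make the assumption under discussion concrete, here is a rough C analogue. It is a sketch only, not LLVM IR and not GVN's actual transformation. It compares "load the pointer, convert it to an integer, truncate" with "load one byte of the stored pointer". The function name is hypothetical, and the sketch assumes that `uintptr_t` exists and that the integer conversion preserves the pointer's in-memory bits, which C leaves implementation-defined.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 when truncating the converted pointer yields the same value as
 * a one-byte load of the pointer's stored representation.
 * Hypothetical helper, for illustration only. */
int truncation_matches_narrow_load(void *const *slot)
{
    /* Wide path: load the pointer, convert to an integer, truncate to 8 bits. */
    void *p = *slot;
    uint8_t truncated = (uint8_t)(uintptr_t)p;

    /* Narrow path: read the bytes of the stored pointer directly. */
    unsigned char bytes[sizeof(void *)];
    memcpy(bytes, slot, sizeof bytes);

    /* Find which stored byte holds the low-order bits on this byte order. */
    const uint16_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    size_t low_index = first ? 0 : sizeof bytes - 1;

    return truncated == bytes[low_index];
}
```

On mainstream flat-address targets this typically returns 1, which is why a pass can get away with rewriting one form into the other. On a target where the pointer-to-integer conversion does not simply reinterpret the stored bits, nothing guarantees that result.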


> My last merge from upstream was about a year ago (and a new one is long
> overdue), but there were issues where GVN was assuming that if it did a
> load of a pointer then a ptrtoint, then a truncation, that it would get the
> same result as doing a narrower load. This is not the case in any platform
> where pointers are not simply integers (i.e. where you actually need
> inttoptr / ptrtoint instead of bitcast).

You keep talking about platforms, but llvm ir itself is not platform
dependent.
Can you give a reference in the language reference that says that this is
not legal?

IE what loads do *on your platform* is completely irrelevant to whether the
IR code is legal or not, only what it codegens to.

LLVM's type semantics (and pointers may not have types, but the load
operations produce values that do) are also not defined in terms of
platform, but in terms of what datalayout says, etc.

What you want seems to be non-integral pointer types.

Which are experimental:

"LLVM IR optionally allows the frontend to denote pointers in certain
address spaces as “non-integral” via the datalayout string
<http://llvm.org/docs/LangRef.html#langref-datalayout>. Non-integral
pointer types represent pointers that have an *unspecified* bitwise
representation; that is, the integral representation may be target
dependent or unstable (not backed by a fixed integer).

inttoptr instructions converting integers to non-integral pointer types are
ill-typed, and so are ptrtoint instructions converting values of
non-integral pointer types to integers. Vector versions of said
instructions are ill-typed as well."

One of the reasons it's experimental is because nobody has made it work in
all cases.

I think whoever wants this to work is going to have to drive fixing it and
making it work sanely.

Hi all,


Hopefully this won't derail this thread -- but I plan to resume work on non-integral pointers very soon (mid December - early Jan). Right now I'm busy with some higher priority things.

We have the same problem as David C., btw: GVN tends to freely convert between pointers and integers. We have local patches that fix old GVN to DTRT, and my plan was to upstream those patches predicated on the pointer types. Same with instcombine (I don't remember if we have other patches).

I'm fine re-doing the same work on NewGVN (prevent inttoptr / ptrtoint on certain class of pointers).

-- Sanjoy


> Hopefully this won't derail this thread -- but I plan to resume work on
> non-integral pointers very soon (mid December - early Jan). Right now I'm
> busy with some higher priority things.

Oh, cool.

> We have the same problem as David C., btw, that GVN tends to freely
> convert between pointers and integers. We have local patches that fix old
> GVN to DTRT, and my plan was to upstream the custom patches predicated
> on the pointer types. Same with instcombine (I don't remember if we have
> other patches).

Neat.

> I'm fine re-doing the same work on NewGVN (prevent inttoptr / ptrtoint on
> certain class of pointers).

Sure, i'm just saying "I don't think it's forbidden except in that case, it
sounds like a new feature".

> You keep talking about platforms, but llvm ir itself is not platform dependent.
> Can you give a reference in the language reference that says that this is not legal?

Nothing in the LangRef (apart from the note about non-integral pointers, which was added recently) makes any claim about the representation of pointers. Pointers in LLVM IR have always been opaque and must explicitly be bitcast or inttoptr / ptrtoint cast to be used as if they were integers.

We have had discussions on the list previously about tightening up the semantics of inttoptr and ptrtoint.

> IE what loads do *on your platform* is completely irrelevant to whether the IR code is legal or not, only what it codegens to.
>
> LLVM's type semantics (and pointers may not have types, but the load operations produce values that do) are also not defined in terms of platform, but in terms of what datalayout says, etc.

GVN is materialising loads that go beyond the bounds of an object. This is undefined behaviour in C and there is nothing in the LangRef that indicates that this should be valid. It is only potentially valid because, on platforms with a page-based MMU as the sole form of memory protection, if you only round up to a power of two then you will still be in the same page (and, likely, cache line) so you will get some unspecified data and can ignore it.
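
The "same page" argument is plain address arithmetic, sketched below. The 4096-byte page size and both function names are assumptions for illustration. The sketch also makes explicit one condition the paragraph leaves implicit: the widened access has to be aligned to its own width for the argument to hold.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed page size, for illustration only. */
#define PAGE_SIZE 4096u

/* Returns 1 if the access [addr, addr + width) stays within one page. */
int stays_in_one_page(uintptr_t addr, uintptr_t width)
{
    return addr / PAGE_SIZE == (addr + width - 1) / PAGE_SIZE;
}

/* Round a byte count (n >= 1) up to the next power of two. */
uintptr_t round_up_pow2(uintptr_t n)
{
    uintptr_t w = 1;
    while (w < n)
        w <<= 1;
    return w;
}
```

For example, a 3-byte object at address 4092 widened to 4 bytes covers 4092..4095 and stays in one page, while the same widening at the unaligned address 4094 would reach into the next page. None of this helps on a target that checks bounds per byte rather than per page.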

> What you want seems to be non-integral pointer types.
>
> Which are experimental:
> "LLVM IR optionally allows the frontend to denote pointers in certain address spaces as “non-integral” via the datalayout string. Non-integral pointer types represent pointers that have an unspecified bitwise representation; that is, the integral representation may be target dependent or unstable (not backed by a fixed integer).
> inttoptr instructions converting integers to non-integral pointer types are ill-typed, and so are ptrtoint instructions converting values of non-integral pointer types to integers. Vector versions of said instructions are ill-typed as well."
>
> One of the reasons it's experimental is because nobody has made it work in all cases.
> I think whoever wants this to work is going to have to drive fixing it and making it work sanely.

Actually, that isn’t what I want, because we do define inttoptr and ptrtoint for our architecture. You can’t implement C without them (or some equivalent) working and we have a fully working C / Objective-C compiler (C++ in progress) using LLVM. ptrtoint is always valid for us, inttoptr may give null depending on the ABI and environment.

I gave a talk in the LLVM track at FOSDEM a couple of years ago about the things that are needed to make LLVM work correctly for targets where integers are not pointers. We have done most of this work, but it is not helped by people propagating the ‘integers are pointers’ assumption (which the LangRef has always been *very* careful not to state) in passes.

David

Very nice to see it!

Piotr

Do you happen to have a link for the talk? We'll try to make sure this
works in the new pass.

Davide,

Slides and video available at FOSDEM 2015 - The CHERI CPU

Kind regards,
Arnaud

That gives some background on our architecture. The talk I was thinking of was this one:

http://llvm.org/devmtg/2015-02/slides/chisnall-pointers-not-int.pdf

David

Glad to see this landing! It's been a long time coming.

Once this is in, please do not turn it on by default immediately. Let's call for volunteers to find some of the most egregious miscompiles, fix them, and then turn this on by default.

Philip

Agreed.
Davide and I agreed that the right play is to cut it down to the core part, and submit that, and then add all the stuff on top.
It would be really nice to get the core part of it well tested so that when we add the stuff like load coercion, etc, we don’t also have to debug too many issues in the core of it.

There are no immediate plans to enable NewGVN by default (at least,
not in the near future). In fact, the mail that I originally wrote
doesn't mention the switch at all, and neither do any follow-ups from
me or Daniel, so I'm not entirely sure where you got that idea from.
If you take a closer look (at the mail, or at the patch), you'll
realize that "key" pieces that are in old GVN are still missing. The
most noticeable are PRE and load coercion. In other words, the patch
proposed is not (yet) on par with what the current GVN does (although
all the missing pieces are already implemented out-of-tree).

Also, let me try to clarify one point. This is already a call for
volunteers. If you feel adventurous, you can download the
patch/apply/test/report issues. I can and I will spend time
integrating the rest of the work and fix all the reported
bugs/miscompiles. If there's something that we can do in a cleaner
way, a discussion will happen on the mailing list/on the review thread
and everybody will have a chance to comment, as it's happening for the
initial patch (and as I always try to do).

Once the first patch lands, I'll commit a temporary cl::opt to enable
NewGVN for those interested in testing and send another CFT e-mail.
FWIW, the patch has already had a round of light testing internally.
Of course, this is not enough, nor is it indicative of its
maturity/robustness. I plan to have it tested more carefully inside my
organization in parallel.

That said, thanks for your input.

100% agree that the core algorithm should be very well tested before
moving forward with the other pieces.