Uncovering non-determinism in LLVM - The Next Steps

Hi all,

Last year I shared with the community my findings about instances of non-determinism in LLVM codegen. The major source was the iteration of unordered containers, which results in non-deterministic iteration order. To uncover such instances we introduced "reverse iteration" of unordered containers (currently only enabled for SmallPtrSet).
I would now like to take this effort forward and propose the following:

1. We are in the process of setting up an internal nightly buildbot which will build LLVM with the CMake flag -DLLVM_REVERSE_ITERATION:BOOL=ON.
This will make all supported containers iterate in reverse order by default. We will then run "ninja check-all". Any failing unit test is a sign of potential non-determinism.

2. With the toolchain built by the above buildbot we also want to run the LNT tests and our entire internal testing suite. We then want to compare the objdump of every object file against the object file produced by a forward/default-iteration toolchain. We ideally want to compare rel vs rel+asserts vs debug builds, and Linux vs Windows toolchains. Any difference in the objdumps could signal potential non-determinism.

3. Currently reverse iteration is enabled only for SmallPtrSet. I am in the process of implementing it for more containers. I have already put up a patch for DenseMap: https://reviews.llvm.org/D35043

4. Simply compiling with -mllvm -reverse-iterate will help us uncover non-determinism due to container iteration order. But once reverse iteration is enabled for several containers, pinpointing the exact container causing a problem becomes tedious. So I propose adding an optional value to this flag, e.g. -mllvm -reverse-iterate=smallptrset, -mllvm -reverse-iterate=densemap, etc.
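
As a rough sketch of what such a flag could look like (hypothetical enum and variable names, not the actual patch), a cl::bits option would let -reverse-iterate be given once per container kind:

  // Hypothetical sketch of a per-container -reverse-iterate option.
  #include "llvm/Support/CommandLine.h"

  namespace {
  enum ReverseIterateKind { RIK_SmallPtrSet, RIK_DenseMap };

  llvm::cl::bits<ReverseIterateKind> ReverseIterateContainers(
      "reverse-iterate",
      llvm::cl::desc("Containers whose iteration order should be reversed"),
      llvm::cl::values(
          clEnumValN(RIK_SmallPtrSet, "smallptrset", "Reverse SmallPtrSet"),
          clEnumValN(RIK_DenseMap, "densemap", "Reverse DenseMap")));
  } // namespace

  // A container would then consult something like
  //   ReverseIterateContainers.getBits() & (1u << RIK_SmallPtrSet)
  // before deciding to walk its buckets in reverse.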

I would like to hear the community's thoughts on these proposals.

Thanks,
Mandeep

From: llvm-dev [mailto:llvm-dev-bounces@lists.llvm.org] On Behalf Of
Grang, Mandeep Singh via llvm-dev
Sent: Thursday, July 06, 2017 2:56 AM
To: llvm-dev@lists.llvm.org
Subject: [llvm-dev] Uncovering non-determinism in LLVM - The Next Steps

Hi all,

Last year I had shared with the community my findings about instances of
non-determinism in llvm codegen. The major source of which was the
iteration of unordered containers resulting in non-deterministic
iteration order. In order to uncover such instances we had introduced
"reverse iteration" of unordered containers (currently only enabled for
SmallPtrSet).
I would now like to take this effort forward and propose to do the
following:

1. We are in the process of setting up an internal nightly buildbot
which would build llvm with the cmake flag -DLLVM_REVERSE_ITERATION:BOOL=ON.
This will make all supported containers iterate in reverse order by

I hope you mean all supported *unordered* containers here. :)

default. We would then run "ninja check-all". Any failing unit test is a
sign of a potential non-determinism.

When you did this with SmallPtrSet, were there tests that failed but
did not actually indicate non-determinism?

2. With the toolchain built by the above buildbot we also want to run
LNT tests and our entire internal testing suite. We then want to compare
the objdump for every obj file against the obj file compiled by a
forward/default iteration toolchain.

Sounds like a plan.

We ideally want to compare rel vs
rel+asserts vs debug with Linux vs Windows toolchains. Any differences
in objdumps could signal a potential non-determinism.

You might want to do this part first, partly because you can do it
today with no further patches to LLVM. I would not be surprised if
the set of differences between rel/rel+asserts was non-empty, as
(I think) +asserts enables some things that actually do different
computations, and not just add some internal-consistency checks.
Also doing this first would let you smoke out any infrastructure
issues in the mechanics of comparing the output of two different
compilers, without the complication of a (relatively) untried option.

My team has done something like your proposal, but comparing with/without
-g and -S. In those cases there are a number of innocuous differences
that we have had to teach our comparison tool to ignore. But we also
found some real bugs and that was very satisfying.

When you start talking about Linux vs Windows, though... I assume you
mean comparing the output from different *hosts* but the same *target.*
Is LNT set up to do cross-compilation? This is just my ignorance of
LNT asking.

There might also be differences across hosts that you just have to live
with. I remember a compiler that had different output depending on
whether it was a 32-bit or 64-bit host, because the constant folder
depended on host arithmetic. (The compiled programs *worked* identically,
but the one compiled on a 32-bit host was a tiny bit less optimized.)
I don't know that LLVM has any issues like this, but it's something to
be aware of.

3. Currently reverse iteration is enabled only for SmallPtrSet. I am in
the process of implementing it for more containers. I have already put
up a patch for DenseMap: https://reviews.llvm.org/D35043

4. Simply compiling with -mllvm -reverse-iterate will help us uncover
non-determinism due to iteration order of containers. But once we have
enabled reverse iteration for several containers then pinpointing the
exact container causing the problem would be more tedious. So I propose
to add an optional value field to this flag, like -mllvm
-reverse-iterate=smallptrset -mllvm -reverse-iterate=densemap, etc.

Seems quite reasonable.

I would like to hear the community's thoughts on these proposals.

I think it's great. There's the obvious potential for combinatoric
explosion of test runs you *could* compare, and I think some of the
other possibilities (+/-asserts, different hosts) are equally
interesting, but I'm really pleased to see this kind of creative
approach to testing.

Thanks!
--paulr

Grang, Mandeep Singh wrote:

I would like to hear the community's thoughts on these proposals.

From the perspective of the recent focused effort on reproducible
builds, I think this is excellent.

We ideally want to compare rel vs
rel+asserts vs debug with Linux vs Windows toolchains. Any differences
in objdumps could signal a potential non-determinism.

You might want to do this part first, partly because you can do it
today with no further patches to LLVM. ...

My team has done something like your proposal, but comparing with/without
-g and -S. In those cases there are a number of innocuous differences
that we have had to teach our comparison tool to ignore. But we also
found some real bugs and that was very satisfying.

It does seem like a good idea to start with comparing output compiled
on 32- and 64-bit hosts, or on Windows and Linux. Out of curiosity
what kinds of changes arise from the presence/absence of -g? (It's not
just the expected presence/absence of the debug info?)

There might also be differences across hosts that you just have to live
with.

Being able to build identical objects using the same version of Clang
built with different host compilers on different operating systems is
nice in that it allows use of techniques from diverse double
compilation to improve confidence in the toolchain. It would be great
to at least know if / how objects currently differ when compiled in
cases like this.

Out of curiosity
what kinds of changes arise from the presence/absence of -g? (It's not
just the expected presence/absence of the debug info?)

Adding -g does have codegen effects, the most common reason being
that it adds IR instructions, some of which have uses of variables,
and sometimes passes don't ignore those things properly. I think
all the really egregious cases have been fixed by now (at one point
they were counted in inlining costs, which meant that -g affected
inlining decisions!), but we see it now and again.

There can also be minor instruction-ordering effects because -g wants
to emit unwind info, which appears in the form of assembler .cfi_*
directives, and those appear to be instruction-ordering barriers.
That's the one that pops to mind as something we haven't tried to fix.

There might be others, I haven't looked at that stuff lately.
--paulr

Hi all,

Last year I had shared with the community my findings about instances of
non-determinism in llvm codegen. The major source of which was the
iteration of unordered containers resulting in non-deterministic
iteration order. In order to uncover such instances we had introduced
“reverse iteration” of unordered containers (currently only enabled for
SmallPtrSet).
I would now like to take this effort forward and propose to do the
following:

  1. We are in the process of setting up an internal nightly buildbot
    which would build llvm with the cmake flag -DLLVM_REVERSE_ITERATION:BOOL=ON.
    This will make all supported containers iterate in reverse order by
    default. We would then run “ninja check-all”. Any failing unit test is a
    sign of a potential non-determinism.

  2. With the toolchain built by the above buildbot we also want to run
    LNT tests and our entire internal testing suite. We then want to compare
    the objdump for every obj file against the obj file compiled by a
    forward/default iteration toolchain. We ideally want to compare rel vs
    rel+asserts vs debug with Linux vs Windows toolchains. Any differences
    in objdumps could signal a potential non-determinism.

  3. Currently reverse iteration is enabled only for SmallPtrSet. I am in
    the process of implementing it for more containers. I have already put
    up a patch for DenseMap: https://reviews.llvm.org/D35043

As mentioned on the review thread (discussion may be better here, but perhaps since it’s already started on the review (sorry I didn’t see this thread earlier/discuss more here)… dunno):

SmallPtrSet not only has unspecified iteration order, it is also keyed only on pointers, and pointer values change from run to run of the compiler. Unspecified iteration order/hashing currently does not change between runs and wouldn’t unless it was changed deliberately to flush out issues like this.

So I can see a solid argument for reverse iteration triggering on any pointer keyed container (even those with guaranteed ordering, in fact - because, again, the pointer values aren’t guaranteed).

I suppose what we’d really want (not realistic, but at least a hypothetical) is for std::less<T*> in containers to invert its comparison under this flag… (then even cases of pointer ordering in non-containers could be flushed out, if everything actually used std::less<T*>)

I wonder if we could create a compiler flag that would compile < on pointers to > instead? Hmm, I guess that would break range checks, though… :(

Ah well, maybe just reversing iteration in containers with unspecified ordering and pointer keys is enough.
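
A minimal sketch of the std::less<T*> inversion idea (the ReverseIterate flag below is just a hypothetical stand-in for however the real flag would be threaded through):

  #include <functional>
  #include <map>

  // Hypothetical stand-in for -mllvm -reverse-iterate.
  static bool ReverseIterate = true;

  // Invert pointer comparisons under the flag, so even containers with
  // *guaranteed* ordering over pointer keys change their walk order.
  template <typename T> struct MaybeReversedPtrLess {
    bool operator()(T *LHS, T *RHS) const {
      return ReverseIterate ? std::less<T *>()(RHS, LHS)
                            : std::less<T *>()(LHS, RHS);
    }
  };

  // e.g. std::map<Instruction *, unsigned, MaybeReversedPtrLess<Instruction>>
  // would iterate in the opposite pointer order when the flag is set.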

It’s possible we could flush out all dependence on hash container ordering, but that may not be worthwhile in a small-ish project like LLVM (it’d make sense if we, say, had a dependence on Boost containers - to ensure it was easy to upgrade to a newer Boost container implementation, it might be worthwhile ensuring we didn’t grow dependence on the container’s ordering even over non-pointer keys, etc)

How much of a priority is it to find codegen changes triggered by flags such as -g? As long as the effects are reliable between runs this should be really easy to attack using C-Reduce. I can look into it if this is important.

John

We’ve had a lot of success using cfe/trunk/utils/check_cfc in combination with C-Reduce to find, report and fix -g codegen change bugs (see http://llvm.org/devmtg/2015-04/slides/Verifying_code_gen_dash_g_final.pdf ). There are definitely a fair few left to go though. Sadly it’s rare that they get near the top of our priority list to deal with, but it would be nice to get the test suites all running green with check_cfc so that we can more quickly spot regressions, so the more people willing to attack it the better, I think.

-Greg

An example of this is the order of predecessors in the IR in phi nodes.
There are passes that will create them in different orders depending on
smallptrset iteration.
This is "non-deterministic" in the sense that the textual form is
different, but has the same semantic meaning either way.
(Let's put aside the fact that allowing them to have a different order
than the actual block predecessors is a pointless waste of time :P)

Whether you consider this non-deterministic depends on your goal.

I would argue that any pass that behaves differently given
phi [[1, block 1], [2, block 2]]
and
phi [[2, block 2], [1, block 1]]

is just flat out broken (and we have some that break due to poor design,
etc)

So i wouldn't consider the above to be non-deterministic in any meaningful
sense, despite it outputting different textual form.
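
To make that concrete, the order-independent way for a pass to read a phi is to key off the predecessor block rather than the operand position - a small illustrative sketch:

  #include "llvm/IR/BasicBlock.h"
  #include "llvm/IR/Instructions.h"

  // Looking up the incoming value by predecessor block gives the same answer
  // for "phi [1, %block1], [2, %block2]" and the reordered
  // "phi [2, %block2], [1, %block1]"; indexing operands positionally does not.
  static llvm::Value *incomingFromPred(llvm::PHINode &PN,
                                       llvm::BasicBlock *Pred) {
    return PN.getIncomingValueForBlock(Pred);
  }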

(see http://llvm.org/devmtg/2015-04/slides/Verifying_code_gen_dash_g_final.pdf

Super cool, I had missed this. I hope to be doing some more LLVM testing work soonish and will keep this tool on the list.

John


One of our definitions of non-determinism is simply "output from command
line tools should always be bit identical given identical inputs", which is
suitable for content-based caching build systems like Bazel.

I don't know how our bitcode encoding compares to the textual IR in the
case of your phi example, but assuming that that difference makes it into
the bitcode too, it would cause e.g. ThinLTO bitcode artifacts to violate
the content-based caching assumptions, even if semantically to the compiler
the difference is immaterial. I know that Richard has mentioned in the past
at least for Clang the intention is bit-identical output for bit-identical
input.

-- Sean Silva


One of our definitions of non-determinism is simply "output from command
line tools should always be bit identical given identical inputs", which is
suitable for content-based caching build systems like Bazel.

Just to point out: These systems already often have to ignore whitespace
differences, etc.

I'd also argue any of these content-based caching build systems is going to
be rarely used if they require every single tool that uses them to produce
bit identical output (IE boiling the ocean), as opposed to letting the
tools define what identical means or something.

I don't know how our bitcode encoding compares to the textual IR in the
case of your phi example, but assuming that that difference makes it into
the bitcode too, it would cause e.g. ThinLTO bitcode artifacts to violate
the content-based caching assumptions, even if semantically to the compiler
the difference is immaterial.

You are certainly welcome to try to make all these things that are
semantically identical try to be completely syntactically identical as
well, but i'm pretty uninterested in slowing down or banning passes from
doing certain things to do that.

IE if you want this to happen, IMHO, this should be happening in output
writing, not somewhere else.

To give another example: Currently, IIRC, the order of basic blocks the
function iterator goes through is "as they appear in input".
There is no real defined or required ordering (IE it's not, for example,
sorted in a preorder walk from the entry block or something) for
correctness.

I could make a pass that randomizes the list which for (auto &BB : F)
walks, and nothing else in the compiler should change :)
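
A rough sketch of such a pass (illustrative only; it keeps the entry block first and permutes the rest):

  #include <algorithm>
  #include <random>
  #include <vector>
  #include "llvm/IR/BasicBlock.h"
  #include "llvm/IR/Function.h"

  // Permute the block list without touching a single instruction; the
  // function's semantics are unchanged, only the visitation order differs.
  static void shuffleBlockOrder(llvm::Function &F, std::mt19937 &RNG) {
    std::vector<llvm::BasicBlock *> Blocks;
    for (llvm::BasicBlock &BB : F)
      Blocks.push_back(&BB);
    if (Blocks.size() < 3)
      return;
    std::shuffle(Blocks.begin() + 1, Blocks.end(), RNG); // keep entry first
    // Re-splice the blocks into the function in the new order.
    for (llvm::BasicBlock *BB : Blocks)
      F.getBasicBlockList().splice(F.end(), F.getBasicBlockList(),
                                   BB->getIterator());
  }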

I don't think you can say such a pass is broken.

Hence I think if you want such constraints, they need to be placed on
output orderings, not on what passes do.


Yeah, I agree it is mostly (if not entirely) an output canonicalization
issue in these cases.

-- Sean Silva

This seems inconsistent with my understanding of LLVM’s (& Google’s, re: Bazel) goals…

So far as I know, LLVM has fixed pretty much any “the same input causes different output bits, even if they’re semantically equivalent” bug that’s reported (well, they’re considered bugs at least - sometimes takes a while for someone to prioritize them)

While it doesn’t outright break Bazel when that property doesn’t hold, I believe it can cause more cache invalidation than is ideal (eg: build system evicts an object file from the cache, but still has the resulting executable in the cache - to do the link step it goes to rebuild the objects*, finds a new/different object and then reruns the link because of it)

Also for LLVM testing purposes, I believe any case where the output text/bits are different are usually fixed so the tests are reliable.

  * maybe this scenario doesn’t really happen - if Bazel assumes reproducibility is transitive, it could observe that the input hashes are the same so it doesn’t need to run the task at all if all the other object files are available - could just assume the resulting binary would be the same as the one it already has

No optimizations should change, but I think it’d be pretty difficult to support testing that pass (granted with LLVM’s testing approach, at least that would be fairly isolated to only the tests for the pass - it wouldn’t pollute other pass tests with nondeterminism of output, so might not be too painful)

I kind of would say it’s broken. Arbitrarily reordering, I’d be OK with, nondeterministically reordering would seem not OK to me (again, based on the sentiments I’ve heard expressed from Chandler and others (though I may’ve misunderstood/may be incorrectly representing their perspective) on the project/my best understanding of the goals, etc - and I tend to agree with them, but could well be wrong)

Ah, hmm, maybe there’s a point of confusion. Changing the order of BBs in a Function would change the output ordering, right? Presumably that would be reflected in printed IR, bitcode, and possibly in block layout (at least at -O0, but I could imagine some aspects of that ordering leak out even when LLVM does advanced block layout optimizations).

One of our definitions of non-determinism is simply "output from command
line tools should always be bit identical given identical inputs", which is
suitable for content-based caching build systems like Bazel.

Just to point out: These systems already often have to ignore whitespace
differences, etc.

I'd also argue any of these content-based caching build systems is going
to be rarely used if they require every single tool that uses them to
produce bit identical output (IE boiling the ocean), as opposed to letting
the tools define what identical means or something.

This seems inconsistent with my understanding of LLVM's (& Google's, re:
Bazel) goals..

My point is that they are certainly welcome to try this path, but it's been
tried before.
There are reasons ccache, et al have various "slop" modes.

I suspect, but can't prove, a system that is so rigid will not be adopted
in practice.
(So far: Only one major player has done so, and did so by fiat!)

So far as I know, LLVM has fixed pretty much any "the same input causes
different output bits, even if they're semantically equivalent" bug that's
reported (well, they're considered bugs at least - sometimes takes a while
for someone to prioritize them)

Which, again, i'm fine with as long as we don't muck up passes a ton to do
it. I'd rather design them properly and fix things like "random limits",
etc.

While it doesn't outright break Bazel when that property doesn't hold, I
believe it can cause more cache invalidation than is ideal (eg: build
system evicts an object file from the cache, but still has the resulting
executable in the cache - to do the link step it goes to rebuild the
objects*, finds a new/different object and then reruns the link because of
it)

Also for LLVM testing purposes, I believe any case where the output
text/bits are different are usually fixed so the tests are reliable.

FWIW: In the cases i cited the tests can already be made reliable in the
face of these differences :)

* maybe this scenario doesn't really happen - if Bazel assumes
reproducibility is transitive, it could observe that the input hashes are
the same so it doesn't need to run the task at all if all the other object
files are available - could just assume the resulting binary would be the
same as the one it already has

Right.
This is definitely faster :)

No optimizations should change, but I think it'd be pretty difficult to
support testing that pass (granted with LLVM's testing approach, at least
that would be fairly isolated to only the tests for the pass - it wouldn't
pollute other pass tests with nondeterminism of output, so might not be too
painful)

I don't think you can say such a pass is broken.

I kind of would say it's broken.

Why?
It makes literally no semantic change to the code that is incorrect.
This is precisely why my view is that if you want *output* to look a
certain way, you need to cause the *output emission* to handle that.

Arbitrarily reordering, I'd be OK with, nondeterministically reordering
would seem not OK to me (again, based on the sentiments I've heard
expressed from Chandler and others (though I may've misunderstood/may be
incorrectly representing their perspective) on the project/my best
understanding of the goals, etc - and I tend to agree with them, but could
well be wrong)

Hence I think if you want such constraints, they need to be placed on
output orderings, not on what passes do.

Ah, hmm, maybe there's a point of confusion. Changing the order of BBs in
a Function would change the output ordering, right?

Yes

Presumably that would be reflected in printed IR, bitcode, and possibly in
block layout (at least at -O0, but I could imagine some aspects of that
ordering leak out even when LLVM does advanced block layout optimizations).

Again, any ordering changes that cause optimization differences are clear
bugs in the pass.
Yes, i know that this means we have a lot of bugs :) I consider any case
where, due to a limit, identical [1] code optimizes differently, to be a
bug. You should too!

block1:
foo
block2:
bar
should optimize the same as

block2:
bar
block1:
foo

Let me try to put this another way:
My argument is precisely that code that is literally identical (but subject
to expression in multiple textual forms) should be optimized the same way,
and that's the avoidance of non-determinism you should shoot for, *not* the
simple subset of "same crap in, same crap out" :P.

This form does *not* preclude the output form from being different for
things that meet [1], and if you want the output form to look the same,
that is a function of ordering output, not of "stopping passes from doing
things":
If you aren't ordering output, and just try to get the "same crap in
produces same crap out" form of non-determinism, you are going to get
screwed by all the random limits, etc, anyway as you improve the passes in
that you will have gratuitous, but deterministic, output change, and
probably some non-determinism anyway.

[1] For the purposes of this discussion, let's define this to mean the
following:
Code that is subject to multiple textual forms that have precisely the same
meaning to llvm, but generate different orderings in our containers/output:

IE textual input
block1:
foo
block2:
bar

should generate the same results as as textual input:

block2:
bar
block1:
foo

I’m not familiar with those - but I don’t expect things would break outright if a tool was non-deterministic (though semantics-preserving); still, determinism (in the same-stuff-in-same-stuff-out sense) is handy/helpful.

What product/player/thing are you referring to here? ^

I’m not sure what you mean by ‘random limits’.

I’m not sure “designing them properly” (in the sense of things like “order of basic blocks should not affect optimization quality/power”) is at odds with this - seems orthogonal & also desirable.

With LLVM’s infrastructure mostly around FileCheck I’ve found making tests order-independent to be somewhat difficult. Do you have particular techniques in mind? (eg: I care that one BB has feature X and one BB has feature Y - I’m not sure how to do that with LLVM’s test infrastructure)

I’m confused, maybe it’s too early or I’m not reading/understanding you correctly. I’m not sure how to communicate well here but think maybe we’re talking past each other. I’m not sure :/

Are you suggesting that LLVM IR/bitcode writing shouldn’t depend on the order of basic blocks in a Function? (or perhaps I’m misunderstanding/that’s coming back to your point that you don’t find it important that IR/Bitcode writing be deterministic?)

Sure - I’m OK with/agree with that - but that wasn’t really what I was thinking of/caring about in this thread/subject. I think that’s important/valuable/would be great (though difficult, I think*?) to fuzz passes to ensure that holds true.

  * Reordering BBs then checking the optimization output was semantically equivalent would require presumably complex semantic checking, or some kind of recanonicalization - sorting output BBs based on some property not related to the order they appear in?

I do, it’s just seems like a different point than whether or not output is deterministic/how to achieve that/whether it’s a goal/etc… so I’m confused by discussing these things together & talking about a position on one issue implying a position on the other…

I’m confused by the description of this as a subset - are you saying “semantically identical” (for want of a better term) code should be optimized to exactly the same bits as output? Or semantically identical bits as output?

OK, I guess no to that last question, if I’m reading this right ^. That means that same-crap-in-same-crap-out isn’t a subset of this goal, then?

continues to be confused

Agreed that sorting various things on output could be one way to achieve determinism of output. I can see how that ends up wrapping these two different issues up in this discussion.

I’m not sure I agree with that direction, though - seems expensive in other ways (sorting things after the fact may be more expensive, etc, rather than preserving determinism throughout) though I don’t know which comes out as the best tradeoff in the end. & hasn’t been the approach LLVM seems to have taken so far - though the approach so far may not be revisited/reconsidered.

I’m not sure email is the best way to continue this discussion, as it’s getting long and complex to follow, so i think at some point we should just go have a chat by one of our cubes. :)

I will say i don’t believe achieving “every pass does not change ordering in containers such that it causes an output change” is a useful idea. As i’ve said, i believe that is for output writing to handle.

Passes simply do not and should not care about stuff like that.

In part because they do not know what is happening with the IR.
It could be executed by a JIT, it could be splatted into a piece of hardware. It could be shipped over a network.
It could simply be thrown away.

The fact that something later happens to write it out is just one possibility, and if that thing wants some deterministic ordering of things in containers (or whatever else causes output ordering to differ), it should be responsible for making that happen, not try to force that cost into every pass :)

Side-tracking on one thing:

Also for LLVM testing purposes, I believe any case where the output text/bits are different are usually fixed so the tests are reliable.

FWIW: In the cases i cited the tests can already be made reliable in the face of these differences :)

With LLVM’s infrastructure mostly around FileCheck I’ve found making tests order-independent to be somewhat difficult. Do you have particular techniques in mind? (eg: I care that one BB has feature X and one BB has feature Y - I’m not sure how to do that with LLVM’s test infrastructure)

Assuming this requires multiple CHECK statements per feature of interest (minimally, one to identify the BB and one to check for the presence of the feature) what you need to do is capture the output and run it through FileCheck twice, with a different prefix for each set of checks.

That’s the only way I’m aware of to make sequences of checks order-independent.

You can often make individual CHECKs order-independent with -DAG (2nd prize winner of most-misleading-suffix-name) but not always.
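
For instance, a hypothetical test (pass name and labels invented) that captures the output once and then checks it with two independent prefixes:

  ; RUN: opt -S -some-pass %s -o %t.ll
  ; RUN: FileCheck %s --check-prefix=FEATX < %t.ll
  ; RUN: FileCheck %s --check-prefix=FEATY < %t.ll
  ;
  ; FEATX-LABEL: bb.x:
  ; FEATX: call void @feature_x(
  ;
  ; FEATY-LABEL: bb.y:
  ; FEATY: call void @feature_y(

Because each prefix is matched in a separate FileCheck run, it doesn't matter which of bb.x or bb.y the compiler happens to emit first.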

Back to your regularly scheduled debate now.

--paulr

Thanks a lot for all your valuable suggestions/comments/opinions. For now, we will go ahead and set up our internal buildbot and infrastructure to uncover non-determinism and share our findings with the community. We can then decide whether we want to fix any “bugs” on a case-by-case basis. I think having the capability/infrastructure to uncover such “non-determinism” is a nice feature. Whether we want to actually fix the supposed “bugs” is a different discussion altogether :)

Thanks,
Mandeep