The undef story

Philip,
email responses are varied: some say what you do, but
others say give the guy a chance and listen to what he has to say.

I say that I have a mild personality disorder such that I can’t say
things in politically correct style, and that this is a disability that I
have had for the last third of my adult life. I would like to think that
folks could accommodate someone with a disability. I would like
to think that llvm-dev could be such a place.

Peter Lawrence.

Is it possible to compile 32-bit programs on Mac? I know that on Linux you can typically install 32-bit libraries and run 32-bit programs. This is a bit extreme, but if that option isn’t available on Mac, you could dual-boot Linux on your machine (at least temporarily) to perform the measurements. If that isn’t possible, you might consider spinning up a Linux machine in the cloud (Amazon, Google Cloud, etc.) and doing the measurements there. Cloud machines aren’t the most stable for benchmarking, but with a decent number of runs and appropriate data analysis you should be able to get reliable results.

Not having access to SPEC isn’t an insurmountable hurdle either.
LLVM’s test-suite has a number of interesting benchmarks and is freely available and easy to set up: http://llvm.org/docs/TestSuiteMakefileGuide.html
If you ever acquire access to the SPEC sources, they are easy to drop in later if you need to.

– Sean Silva

Sanjoy,
you seem to be changing your story. When I first brought up the
function-inlining example you argued that I should not use C/C++ for such
programs, and that you were not fixing it for some reason related to how the IR is an
abstraction and C/C++ wasn’t your problem. I pushed you on the issue by saying
let’s translate the C example into IR and then talk about it; you went silent
after that. No response.

Same thing when I brought up the example of not hoisting a loop-invariant divide out of
a loop; you were silent about that, leading me to believe that you were not
addressing that either.

What changed ?

Peter Lawrence.

Hi Peter,

            you seem to be changing your story. When I first brought up the
function-inlining example you argued that I should not use C/C++ for such
programs, and that you were not fixing it for some reason related to how the
IR is an abstraction and C/C++ wasn’t your problem. I pushed you on the issue
by saying let’s translate the C example into IR and then talk about it; you
went silent after that. No response.

I went silent because I have a limited budget to spend on llvm-dev.

However, I don't think I've changed my stance here. In this thread, I
said exactly the same thing as I've said before.

I don't see how "translating the example into IR" is not addressed by
"IR is an abstraction and C/C++ wasn’t your problem". The C example
that would trigger the "unintuitive" behavior is:

void g(int y);

void f() {
  int x;    /* uninitialized */
  g(x);     /* passes an indeterminate value */
}

void g(int y) {
  if (y == y)
    S;
}

which has UB, and this falls under the purview of "IR is an
abstraction and C/C++ wasn’t your problem", which is a less charitable
way of saying what I've said above in this thread.

Same thing when I brought up the example of not hoisting a loop-invariant divide out of
a loop; you were silent about that, leading me to believe that you were not
addressing that either.

As I've said above, this is addressed in the paper.

-- Sanjoy

Peter, I’m a bit mystified by your response. I thought Sanjoy did a pretty good job giving answers to your technical questions. Sanjoy’s response was a bit terse on the technical points, so maybe I can elaborate a bit more.

Chandler,
               I am not a “politically correct” person, never have been, never will be.
If you are waiting for me to make a politically incorrect statement so you can jump
on it, let me assure you that you will never be disappointed.

Peter, let me be perfectly clear: we insist on a certain level of decorum on this list. You're over the line. We treat each other here with civility and respect. This is spelled out in more detail in our code of conduct: http://llvm.org/docs/CodeOfConduct.html

But if that’s all you do then you and llvm lose out. If you want to actually help
llvm move forward then you should judge what I say based on its merit, not on its
delivery.

That is not our view. There are certain methods of delivery that are inappropriate. Does this mean that the community might miss out on technical advancement? Possibly. But we're willing to pay that price.

  -Hal

I can’t comment on SPEC, but this does remind me of code I was working on recently. To abstract the relevant parts, it looked something like this:

template <typename T>
int do_something(T mask, bool cond) {
  if (mask & 2)
    return 1;
  if (cond) {
    T high_mask = mask >> 48;
    if (high_mask > 5)
      do_something_1(high_mask);
    else if (high_mask > 3)
      do_something_2();
  }
  return 0;
}

This function ended up being instantiated on different types T (e.g. unsigned char, unsigned int, unsigned long, etc.) and, dynamically, cond was always false when T was char. The question is: can the compiler eliminate all of the code predicated on cond for the smaller types? In this case, this code was hot, and moreover, performance depended on the fact that, for T = unsigned char, the function was inlined and the branch on cond was eliminated. In the relevant translation unit, however, the compiler would never see how cond was set.

Luckily, we do the right thing here currently. In the case where T = unsigned char, we end up folding both of the high_mask tests as though they were false. That entire part of the code is eliminated, the function is inlined, and everyone is happy.

Why was I looking at this? As it turns out, if the ‘else if’ in this example is just ‘else’, we don’t actually eliminate both sides of the branch. The same is true for many other variants of the conditionals (i.e. we don’t recognize all of the code as dead). Once we have a self-consistent model for undef, we should be able to fix that. The user was confused, however, about why seemingly innocuous changes to the code changed the performance characteristics of their application. The proposed semantics by John, et al. should fix this uniformly.

In any case, to your point about:

  if (a == a)
    S;

I have the same thought. If a == undef here, the code should be dead. Dead code must be aggressively dropped to enable inlining and further optimization. This is an important way we eliminate abstraction penalties. Dead code also has costs in terms of register allocation, speculative execution, inlining, etc.

I’ve also seen cases where templated types are used with fixed-sized arrays, where the compiler leveraged knowledge of UB on uninitialized values and out-of-bounds accesses to eliminate unnecessary parts of the code. In short, “optimizing on undefined behavior” can end up being an important tool.

 -Hal

Yep. 32-bit libraries are always present and "-m32" works as usual.

Tim.

This is 100% not about political correctness.

It's about the repeated lack of empathy and understanding.
You believe the things you are saying are somehow right, and the reason
others disengage is because they can't handle the rightness.

This is your repeated quip about others having an emotional reaction to
your rightness.

As mentioned repeatedly now, the reason they are disengaging is, bluntly,
that they think you are both wrong and an asshole.

I'd like to politely ask for whoever is capable of putting their ego aside
for a minute and re-approach things from a pure technical perspective.

Peter is not willing to do so; why should others?

#1 Is there technical merit to what Peter is trying to convey? If yes,
please reset.

No, sorry, this makes no sense.

Shouldn't we all be a little empathetic to the engineers who are
brilliant, but sometimes have a few rough edges?

People are empathetic. That is why they have spent time trying to help.

But past that, no, there is not enough genius to make up for being an
asshole.

Any company that has enough people has learned this over time.

People don't get a pass for being smart.
All the others on this thread are *also* brilliant, but somehow, they are
able to conform their behavior to accepted social norms.

Part of being a "brilliant" high level engineer is being able to present
your ideas in a way that isn't offensive to others.
In fact, part of that role is being able to convince others to go along
with you.
Otherwise, I'd argue, you are not as brilliant or high-level as you think.

If the tone of an email comes across the wrong way, I thought there was an
established way to try to positively handle the situation?

People have repeatedly tried.

I'd actually urge the opposite of what you suggest. People should not be engaging here
until a discussion can be had in a normal manner without repeated insults,
etc.

Anything else sends a very wrong message about our community.

Yo.. if I can't get away with both directly insulting others and using
profanity.. you can't either.. Let's be fair.. Take it off-list if you want
to attack someone (justly or not).

I say that I have a mild personality disorder such that I can’t say
things in politically correct style, and that this is a disability that I
have had for the last third of my adult life. I would like to think that
folks could accommodate someone with a disability. I would like
to think that llvm-dev could be such a place.

You are not alone in making such a claim. However, claiming an inability to discern hurtful behavior does not give you license to commit hurtful behavior. In a community whose norms include avoiding hurtful behavior, it means you need to pay more attention to that possibility in order to meet those norms.

Here’s a procedure I try to use after writing an email, and that has been helpful:

if (mentioned someone or addressed remarks directly to someone)
    and (attributed some non-technical motivation or emotional response to that person)
then remove or rewrite that part

Note that the nature (positive or negative) of the attributed motivation or response is not relevant, just that I am making such an attribution. Mostly I can identify those cases pretty easily.

It gets me to change statements like “Peter, you are emotionally incapable of participating positively in this community” to “Peter, here’s a tactic to help you participate more positively in this community.”

Hope this helps,

–paulr

Moving this discussion back to something productive: I think if there’s any chance of technical merit left, it should be around bugs filed and maybe even patches attached. A purely technical discussion, with code to review, would allow everyone to move past feelings and focus on what’s really important.

Sorry, I thought I sent that privately.

Paul,
        Yes, that helps, thanks for the encouragement, I am listening.
Peter.

I can’t comment on SPEC, but this does remind me of code I was working on recently. To abstract the relevant parts, it looked something like this:

template <typename T>
int do_something(T mask, bool cond) {
  if (mask & 2)
    return 1;
  if (cond) {
    T high_mask = mask >> 48;
    if (high_mask > 5)
      do_something_1(high_mask);
    else if (high_mask > 3)
      do_something_2();
  }
  return 0;
}

This function ended up being instantiated on different types T (e.g. unsigned char, unsigned int, unsigned long, etc.) and, dynamically, cond was always false when T was char. The question is: can the compiler eliminate all of the code predicated on cond for the smaller types? In this case, this code was hot, and moreover, performance depended on the fact that, for T = unsigned char, the function was inlined and the branch on cond was eliminated. In the relevant translation unit, however, the compiler would never see how cond was set.

Luckily, we do the right thing here currently. In the case where T = unsigned char, we end up folding both of the high_mask tests as though they were false. That entire part of the code is eliminated, the function is inlined, and everyone is happy.

Why was I looking at this? As it turns out, if the ‘else if’ in this example is just ‘else’, we don’t actually eliminate both sides of the branch. The same is true for many other variants of the conditionals (i.e. we don’t recognize all of the code as dead).

I apologize in advance if I have missed something here and am misreading your example…

This doesn’t make sense to me: a shift amount of 48 is “undefined” for unsigned char.
How do we know this isn’t a source-code bug?
What makes us think that the user intended the result to be “0”?

This strikes me as odd: we are interpreting the user’s code
in such a way as to improve performance, but that isn’t necessarily what the user intended.

Here’s one way to look at this issue: if something is “C undefined behavior” then
the standard says, among other things, that we could trap here.
Why aren’t we doing that rather than optimizing it?

Here’s another way to look at it: no one has ever filed a bug that reads
“I used undefined behavior in my program, but the optimizer isn’t taking advantage of it.”
But if they do, I think the response should be
“you should not expect that; the standard says nothing positive about what undefined behavior does.”

Once we have a self-consistent model for undef, we should be able to fix that. The user was confused, however, why seemingly innocuous changes to the code changed the performance characteristics of their application. The proposed semantics by John, et al. should fix this uniformly.

In any case, to your point about:

if (a == a)
S;

I have the same thought. If a == undef here, the code should be dead. Dead code must be aggressively dropped to enable inlining and further optimization. This is an important way we eliminate abstraction penalties. Dead code also has costs in terms of register allocation, speculative execution, inlining, etc.

And yet, IIRC, Sanjoy in his last email was arguing for consistent behavior in cases like

if (x != 0) {
  /* we can optimize in the then-clause assuming x != 0 */
}

and in the case above, when it is a function that gets inlined.

Here’s what Sanjoy said about the function-inline case

This too is fixed in the semantics mentioned in the paper. This also
isn’t new to us, it is covered in section 3.1 “Duplicate SSA Uses”.

So this issue seems to be up in the air

I’ve also seen cases where templated types are used with fixed-sized arrays, where the compiler leveraged knowledge of UB on uninitialized values and out-of-bounds accesses to eliminate unnecessary parts of the code. In short, “optimizing on undefined behavior” can end up being an important tool.

As you can tell from my first comments, I am not yet convinced, and would still like to see real evidence.

Peter Lawrence.

It is a source bug, if the code is ever executed. This is in fact a
class of real-world bugs, as CPUs *do* implement overly large shifts
differently. There are two different views here:
(1) It obviously means the result should be zero, since all bits are
shifted out.
(2) It is faster to just mask the operand and avoid any compares in the
ALU. It has the nice side effect of simplifying rotation in software.

ARM and X86 are examples of those two views, respectively.

Joerg

As I said, this is a representation of what the real code did and looked like after other inlining had taken place, etc. In the original form, the user’s intent was clear. That code is never executed when T is a small integer type. That is exactly what the user intended. That’s why I brought it up as an example.

We could. In fact, we have great tools (UBSan, ASan, etc.) that will instrument the code to do exactly that.

You say that as though it is true. It is not. Yes, users file bugs like that (although they don’t often phrase it as “undefined behavior”, but rather, “the compiler should figure out that…”, and often, taking advantage of UB is the only available way for the compiler to figure that thing out). Type-based aliasing rules are another common case where this UB-exploitation comes up (although not in a way that directly deals with undef/poison). And, of course, often we do have to tell our users that the compiler has no way to figure something out. When we have a tool, and sometimes that tool is exploiting our assumption that UB does not happen, then we use it. You may disagree with decisions to exploit certain classes of UB in certain situations, and that’s fine. We do use our professional judgment and experience to draw a line somewhere in this regard.

I don’t believe these are contradictory statements. In the proposed semantics, we get to assume that branching on poison is UB, and thus, doesn’t happen. So, if it were inevitable on some code path, that code path must be dead.

I understand. However, to say that it is not useful to optimize based on UB, even explicit UB, or that this is never something that users desire, is not true.

-Hal

Hi Peter,

Here’s another way to look at it: no one has ever filed a bug that reads
“I used undefined behavior in my program, but the optimizer isn’t taking advantage of it.”
But if they do, I think the response should be
“you should not expect that; the standard says nothing positive about what undefined behavior does.”

Of course no one would file such a bug (since if your program has UB,
the first thing you do is fix your program). However, there are
plenty of bugs where people complain about: "LLVM does not optimize my
(UB-free) program under the assumption that it does not have UB"
(which is what poison allows):

https://bugs.llvm.org/show_bug.cgi?id=28429
https://groups.google.com/forum/#!topic/llvm-dev/JGsDrfvS5wc

Once we have a self-consistent model for undef, we should be able to fix
that. The user was confused, however, why seemingly innocuous changes to the
code changed the performance characteristics of their application. The
proposed semantics by John, et al. should fix this uniformly.

In any case, to your point about:

  if (a == a)
    S;

I have the same thought. If a == undef here, the code should be dead. Dead
code must be aggressively dropped to enable inlining and further
optimization. This is an important way we eliminate abstraction penalties.
Dead code also has costs in terms of register allocation, speculative
execution, inlining, etc.

And yet, IIRC, Sanjoy in his last email was arguing for consistent behavior
in cases like

if (x != 0) {
  /* we can optimize in the then-clause assuming x != 0 */
}

and in the case above, when it is a function that gets inlined.

Here’s what Sanjoy said about the function-inline case

This too is fixed in the semantics mentioned in the paper. This also
isn't new to us, it is covered in section 3.1 "Duplicate SSA Uses".

So this issue seems to be up in the air

This issue is *not* up in the air -- the paper addresses this problem
in the new semantics in the way Hal described: since "if (poison ==
poison)" is explicitly UB in the new semantics, we will be able to
aggressively drop the comparison and everything that it dominates.

I've also seen cases where templated types are used with fixed-sized arrays
where the compiler to leveraged knowledge of UB on uninitialized values and
out-of-bounds accesses to eliminate unnecessary part of the code. In short,
"optimizing on undefined behavior" can end up being an important tool.

As you can tell from my first comments, I am not yet convinced, and would
still like to see real evidence

I'm not sure why what Hal mentioned does not count as real evidence.
The things he mentioned are cases where "exploiting" undefined
behavior results in smaller code size and better performance.

-- Sanjoy

I will still have a hard time believing this until I see a real example. Can you fill in the details?

Peter Lawrence.