readonly and infinite loops

Running -early-cse on

declare void @rn() readnone nounwind

define void @f() {
entry:
  call void @rn()
  ret void
}

removes the call to @rn(). But @rn() could have had an infinite loop
in it in which case @f() went from being a non-terminating
program to an terminating no-op. Is this intentional?

The only way I can see this transform being legal is if infinite loops
are declared to have undefined behavior, but I could not find anything
in the LLVM specification that mentions this.

-- Sanjoy

At least in C/C++ that's UB, yes.
Some years ago there was a lot of discussion about this on the ML. Se Nick's patch, for example: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20100705/103670.html
The patch was eventually dropped, I don't recall why. I think it's fine to assume that readnone functions always terminate. It would probably fine for readonly functions as well.

Nuno

At least in C/C++ that's UB, yes.

So you cannot map every turing machine to a valid C/C++ program then. :slight_smile:

Also, does this mean that "daemon" programs that run continuously till
they're killed by the OS (using a mechanism that is not visible in C)
are effectively undefined?

-- Sanjoy

In C, dunno, but in LLVM, it means they aren't readonly :slight_smile:

No; the standard says that we get to assume that the loop will eventually terminate, or do I/O, or a few other possible things. I can find the text for you later if you’d like.

-Hal

Sent from my Verizon Wireless 4G LTE DROID

At least in C/C++ that’s UB, yes.

So you cannot map every turing machine to a valid C/C++ program then. :slight_smile:

Also, does this mean that “daemon” programs that run continuously till
they’re killed by the OS (using a mechanism that is not visible in C)
are effectively undefined?

– Sanjoy

No; the standard says that we get to assume that the loop will eventually
terminate, or do I/O, or a few other possible things. I can find the text
for you later if you'd like.

Actually, I'm more interested in what LLVM (not C/C++) says about
"while(1);" loops. Such loops are legal in Java (and possibly other
languages that use LLVM) so it's problematic if LLVM specifies them to
have UB.

-- Sanjoy

In that case, -functionattrs needs to be fixed:

define void @infloop() {
entry:
  br label %l

l:
  br label %l
}

== opt -functionattrs ==>

; Function Attrs: readnone
define void @infloop() #0 {
entry:
  br label %l

l: ; preds = %l, %entry
  br label %l
}

attributes #0 = { readnone }

-- Sanjoy

From: "Sanjoy Das" <sanjoy@playingwithpointers.com>
To: "Hal J. Finkel" <hfinkel@anl.gov>
Cc: "Nuno Lopes" <nunoplopes@sapo.pt>, "LLVM Developers Mailing List" <llvmdev@cs.uiuc.edu>
Sent: Saturday, June 27, 2015 7:17:16 PM
Subject: Re: [LLVMdev] readonly and infinite loops

> No; the standard says that we get to assume that the loop will
> eventually
> terminate, or do I/O, or a few other possible things. I can find
> the text
> for you later if you'd like.

Actually, I'm more interested in what LLVM (not C/C++) says about
"while(1);" loops. Such loops are legal in Java (and possibly other
languages that use LLVM) so it's problematic if LLVM specifies them
to
have UB.

I believe our general policy is not to declare such behavior undefined, at the IR level. However, we also need to be careful to specify what the observable effect of such a program would be, and how changes to that state should be interpreted.

-Hal

You dropped some context…

You dropped some context...

A daemon program wouldn't be readonly. An infinite loop can be.

Right.

To prevent miscommunication, here is a quick analysis of a problematic
(IMO) example:

We start with

define void @infloop(i1 %c) {
 entry:
  br i1 %c,  label %l, label %e

 l:
  br label %l

 e:
  ret void
}

define void @main_func() {
 entry:
  call void @infloop(i1 1)
  ret void
}

In this program `@main_func`, when called, will loop infinitely.

If we run this program through `opt -functionattrs -prune-eh
-early-cse`, we get

; Function Attrs: nounwind readnone
define void @infloop(i1 %c) #0 {
entry:
  br i1 %c, label %l, label %e

l:                                                ; preds = %l, %entry
  br label %l

e:                                                ; preds = %entry
  ret void
}

; Function Attrs: nounwind
define void @main_func() #1 {
entry:
  ret void
}

attributes #0 = { nounwind readnone }
attributes #1 = { nounwind }

LLVM has optimized `@main_func` to return immediately.
`-functionattrs` and `-prune-eh` infer `nounwind readnone` for
`@infloop` and `-early-cse` delets such calls that are dead.

There are three ways to justify what happened:

1. The optimization was correct. Infinite loops [1] in LLVM IR are
    UB and since the original program had UB, it can be optimized to
    anything. This is problematic since if this were true, we would
    have trouble writing frontends for programming languages that
    actually have a well defined `while(1);`.

2. The bug is in `-early-cse`: a `readnone nounwind` function can
    loop infinitely, so it is not correct to remove a dead call to
    such a function.

3. The bug is in `-functionattrs`: a function cannot be marked as
    `readonly` if it may loop infinitely.

Needless to say, both (2) and (3) are "easy to fix"; the question is
really about the deeper semantic issue about how LLVM treats
`while(1);`.

There are interesting related questions on the semantics of the
equivalent C/C++ programs, but I personally am interested
only in "pure" LLVM IR semantics.

-- Sanjoy

[1]: We could make the distinction between infinite loops that are
empty vs. infinite loops that are non-empty but that's tricky -- if
only empty infinite loops were UB then e.g. LICM gets harder, since
sinking the store out of loops like `while(1) *addr = 42;` now
introduces UB.

What's really I missing, I believe, is the equivalent of the GCC "pure" attribute - which explicitly forbids infinite loops. I think readnone + halting (like in the proposed 2010 patch) is good enough, though. Then every function emitted by a C/C++ FE could be marked as halting, and we could then fix bug [2] by requiring "readnone halting".
One potential issue with this is that "readnone halting" isn't strong enough to allow speculative execution of functions (because they may have other UB), but I'm not entirely sure GCC "pure" allows that either.

Regarding Java - what are the expected semantics of infinite loops without side-effects, per spec?
It seems that for "trivial" infinite loops (while(1):wink: the FE should actually generate a compile-time error, per JLS 14.21:
https://docs.oracle.com/javase/specs/jls/se8/html/jls-14.html#jls-14.21

The JLS does, however, explicitly claim that this is legal code:
{
    int n = 5;
    while (n > 7) k = 2;
}

But I don't see what the semantics of this are. Does it actually mandate non-termination?

Michael

What's really I missing, I believe, is the equivalent of the GCC "pure" attribute - which explicitly forbids infinite loops. I think readnone + halting (like in the proposed 2010 patch) is good enough, though. Then every function emitted by a C/C++ FE could be marked as halting, and we could then fix bug [2] by requiring "readnone halting".

This looks like the right fix to me too.

One potential issue with this is that "readnone halting" isn't strong enough to allow speculative execution of functions (because they may have other UB)

Agreed.

Regarding Java - what are the expected semantics of infinite loops without side-effects, per spec?
It seems that for "trivial" infinite loops (while(1):wink: the FE should actually generate a compile-time error, per JLS 14.21:
Chapter 14. Blocks and Statements

By hit and trial with `javac`, it looks like cases like these are legal:

public class P {
     public static int f() {
          while (true);
     }

     public static void main(String argv) {
          f();
     }
}

But you're supposed to get compiler errors in cases like this:

public class P {
     public static int f() {
          while (true);
          return 10;
     }

     public static void main(String argv) {
          f();
     }
}

Since "return 10;" is now an unreachable statement.

In any case, it was misleading of me to talk about Java (the language)
semantics; what I really care about are java *bytecode* semantics,
which is what we JIT compile via LLVM IR. And as far as I can tell,
there is nothing in the bytecode spec that rules out trivial infinite
loops. I will take a closer look and discuss this with my colleagues,
in case I've managed to miss something.

The JLS does, however, explicitly claim that this is legal code:
{
    int n = 5;
    while (n > 7) k = 2;
}

But I don't see what the semantics of this are. Does it actually mandate non-termination?

The execution will never enter the while loop, as far as I can tell.

-- Sanjoy

From: "Sanjoy Das" <sanjoy@playingwithpointers.com>
To: "Michael M Kuperstein" <michael.m.kuperstein@intel.com>
Cc: "LLVM Developers Mailing List" <llvmdev@cs.uiuc.edu>
Sent: Sunday, June 28, 2015 3:40:14 AM
Subject: Re: [LLVMdev] readonly and infinite loops

> What's really I missing, I believe, is the equivalent of the GCC
> "pure" attribute - which explicitly forbids infinite loops. I
> think readnone + halting (like in the proposed 2010 patch) is good
> enough, though. Then every function emitted by a C/C++ FE could be
> marked as halting, and we could then fix bug [2] by requiring
> "readnone halting".

This looks like the right fix to me too.

I agree, we'll need a halting attribute.

> One potential issue with this is that "readnone halting" isn't
> strong enough to allow speculative execution of functions (because
> they may have other UB)

Agreed.

> Regarding Java - what are the expected semantics of infinite loops
> without side-effects, per spec?
> It seems that for "trivial" infinite loops (while(1):wink: the FE
> should actually generate a compile-time error, per JLS 14.21:
> Chapter 14. Blocks and Statements

By hit and trial with `javac`, it looks like cases like these are
legal:

public class P {
     public static int f() {
          while (true);
     }

     public static void main(String argv) {
          f();
     }
}

But you're supposed to get compiler errors in cases like this:

public class P {
     public static int f() {
          while (true);
          return 10;
     }

     public static void main(String argv) {
          f();
     }
}

Since "return 10;" is now an unreachable statement.

In any case, it was misleading of me to talk about Java (the
language)
semantics; what I really care about are java *bytecode* semantics,
which is what we JIT compile via LLVM IR. And as far as I can tell,
there is nothing in the bytecode spec that rules out trivial infinite
loops. I will take a closer look and discuss this with my
colleagues,
in case I've managed to miss something.

In Java, threads can have, formally, a well-defined 'hang' state:

  Chapter 17. Threads and Locks

The argument, in part, from Jeremy Manson's memory-model work (which forms the basis for the current specification):

  http://drum.lib.umd.edu/bitstream/1903/1949/1/umi-umd-1898.pdf

for why otherwise-side-effect-free infinite loops in Java need to be well defined, is because a thread's state is observable:

  Thread.State (Java Platform SE 7 )

For C++, we could actually make a similar argument (we have clocks, threads with join(), etc.), but the committee wisely chose to keep the original prohibition on non-terminating side-effect-free executions regardless.

-Hal

People have been aware of this issue for a long time (as you can see by Nick's 2010 patch). I guess it was not pushed further because of lack of practical interest.
Maybe Nick remembers more about the issue and why the patch was dropped.

Sanjoy: do you have a practical concern about this issue? I mean, of course this can be "fixed", but it will require some work, and even a termination checker for -functionattrs for the non-C/C++ frontends.

Nuno

From: "Nuno Lopes" <nunoplopes@sapo.pt>
To: "Sanjoy Das" <sanjoy@playingwithpointers.com>, "Jeremy Lakeman" <Jeremy.Lakeman@gmail.com>, nlewycky@google.com
Cc: "LLVM Developers Mailing List" <llvmdev@cs.uiuc.edu>
Sent: Sunday, June 28, 2015 6:08:52 AM
Subject: Re: [LLVMdev] readonly and infinite loops

People have been aware of this issue for a long time (as you can see
by
Nick's 2010 patch). I guess it was not pushed further because of
lack of
practical interest.

I can't comment about the rationale in 2010, but this came up again when I was originally working on heap-to-stack conversion. Being able to mark functions as halting is essential for doing heap-to-stack conversion. This was put on hold, however, essentially awaiting the new pass manager. The issue is that you'd like to be able to use SCEV in a module/CGSCC-level pass to infer the attribute on functions with loops, and this cannot be done directly until we have the new pass manager.

On the other hand, maybe we don't need to wait for that now. We now also have a scheme for tagging loops with metadata, and we could use that to add metadata to loops known to have a finite trip count, and then use that metadata in a pass that infers the attribute (FunctionAttrs presumably).

-Hal

Sanjoy: do you have a practical concern about this issue? I mean, of course

I don't have any immediate practical concerns and I've not actually
seen miscompiles resulting from this. But fixing this is certainly
more important than a "nice to have" -- we cannot miscompile "while
(true) ;" loops.

-- Sanjoy

People have been aware of this issue for a long time (as you can see
by
Nick's 2010 patch). I guess it was not pushed further because of
lack of
practical interest.

I can't comment about the rationale in 2010, but this came up again when I was originally working on heap-to-stack conversion. Being able to mark functions as halting is essential for doing heap-to-stack conversion. This was put on hold, however, essentially awaiting the new pass manager. The issue is that you'd like to be able to use SCEV in a module/CGSCC-level pass to infer the attribute on functions with loops, and this cannot be done directly until we have the new pass manager.

Interesting. Could you give an example why knowing a function will halt is essential for the heap-to-stack conversion? It's definitely an optimization we should be doing (although, correct me if I'm wrong, LLVM already has some form of this for very small mallocs, no?)

Thanks,
Nuno

From: "Nuno Lopes" <nunoplopes@sapo.pt>
To: "Hal Finkel" <hfinkel@anl.gov>
Cc: "LLVM Developers Mailing List" <llvmdev@cs.uiuc.edu>, "Sanjoy Das" <sanjoy@playingwithpointers.com>, "Jeremy
Lakeman" <Jeremy.Lakeman@gmail.com>, nlewycky@google.com
Sent: Sunday, June 28, 2015 4:39:44 PM
Subject: Re: [LLVMdev] readonly and infinite loops

>> People have been aware of this issue for a long time (as you can
>> see
>> by
>> Nick's 2010 patch). I guess it was not pushed further because of
>> lack of
>> practical interest.
>
> I can't comment about the rationale in 2010, but this came up again
> when I
> was originally working on heap-to-stack conversion. Being able to
> mark
> functions as halting is essential for doing heap-to-stack
> conversion. This
> was put on hold, however, essentially awaiting the new pass
> manager. The
> issue is that you'd like to be able to use SCEV in a
> module/CGSCC-level
> pass to infer the attribute on functions with loops, and this
> cannot be
> done directly until we have the new pass manager.

Interesting. Could you give an example why knowing a function will
halt is
essential for the heap-to-stack conversion?

The key situation you need to establish in order to do heap-to-stack conversion, is that you can see the calls to free (or realloc, etc.) along all control-flow paths. If you can, then you can perform the conversion (because any additional calls to free that you can't observe would lead to a double free, and thus undefined behavior). Thus, if we have this situation:

void bar(int *a);
void foo() {
  int *a = (int*) malloc(sizeof(int)*40);
  bar(a);
  free(a);
}

we can perform heap-to-stack conversion iff we know that bar(int *) always returns normally. If it never returns (perhaps by looping indefinitely) then it might capture the pointer, pass it off to some other thread, and that other thread might call free() (or it might just call free() itself before looping indefinitely). In short, we need to know whether the call to free() after the call to bar() is dead. If we know that it is reached, then we can perform heap-to-stack conversion. Also worth noting is that because we unconditionally free(a) after the call to bar(a), it would not be legal for bar(a) to call realloc on a (because if realloc did reallocate the buffer we'd end up freeing it twice when bar(a) did eventually return).

It's definitely an
optimization
we should be doing (although, correct me if I'm wrong, LLVM already
has some
form of this for very small mallocs, no?)

Not that I recall, although we will remove unused mallocs (those that are immediately freed).

Thanks again,
Hal

Interesting. Could you give an example why knowing a function will
halt is
essential for the heap-to-stack conversion?

The key situation you need to establish in order to do heap-to-stack conversion, is that you can see the calls to free (or realloc, etc.) along all control-flow paths. If you can, then you can perform the conversion (because any additional calls to free that you can't observe would lead to a double free, and thus undefined behavior). Thus, if we have this situation:

void bar(int *a);
void foo() {
  int *a = (int*) malloc(sizeof(int)*40);
  bar(a);
  free(a);
}

we can perform heap-to-stack conversion iff we know that bar(int *) always returns normally. If it never returns (perhaps by looping indefinitely) then it might capture the pointer, pass it off to some other thread, and that other thread might call free() (or it might just call free() itself before looping indefinitely). In short, we need to know whether the call to free() after the call to bar() is dead. If we know that it is reached, then we can perform heap-to-stack conversion. Also worth noting is that because we unconditionally free(a) after the call to bar(a), it would not be legal for bar(a) to call realloc on a (because if realloc did reallocate the buffer we'd end up freeing it twice when bar(a) did eventually return).

I see, thanks!
Your argument is that knowing that bar returns implies that 'a' cannot be captured or reallocated, otherwise it would be UB. Makes sense, yes.

Thanks,
Nuno

From: "Nuno Lopes" <nunoplopes@sapo.pt>
To: "Hal Finkel" <hfinkel@anl.gov>
Cc: "LLVM Developers Mailing List" <llvmdev@cs.uiuc.edu>, "Sanjoy Das" <sanjoy@playingwithpointers.com>, "Jeremy
Lakeman" <Jeremy.Lakeman@gmail.com>, nlewycky@google.com
Sent: Tuesday, June 30, 2015 4:30:15 PM
Subject: Re: [LLVMdev] readonly and infinite loops

>> Interesting. Could you give an example why knowing a function
>> will
>> halt is
>> essential for the heap-to-stack conversion?
>
> The key situation you need to establish in order to do
> heap-to-stack
> conversion, is that you can see the calls to free (or realloc,
> etc.) along
> all control-flow paths. If you can, then you can perform the
> conversion
> (because any additional calls to free that you can't observe would
> lead to
> a double free, and thus undefined behavior). Thus, if we have this
> situation:
>
> void bar(int *a);
> void foo() {
> int *a = (int*) malloc(sizeof(int)*40);
> bar(a);
> free(a);
> }
>
> we can perform heap-to-stack conversion iff we know that bar(int *)
> always
> returns normally. If it never returns (perhaps by looping
> indefinitely)
> then it might capture the pointer, pass it off to some other
> thread, and
> that other thread might call free() (or it might just call free()
> itself
> before looping indefinitely). In short, we need to know whether the
> call
> to free() after the call to bar() is dead. If we know that it is
> reached,
> then we can perform heap-to-stack conversion. Also worth noting is
> that
> because we unconditionally free(a) after the call to bar(a), it
> would not
> be legal for bar(a) to call realloc on a (because if realloc did
> reallocate the buffer we'd end up freeing it twice when bar(a) did
> eventually return).

I see, thanks!
Your argument is that knowing that bar returns implies that 'a'
cannot be
captured or reallocated, otherwise it would be UB.

Yes. Technically, it can be captured, it just does not matter if it is captured, but the captured value is 'dead' once you pass the call to free(a).

-Hal