Hi Nadav,

I almost missed your email because you replied only to the list. I understand your argument. I think that my fix addresses a part of it. It says that instructions that do not return should not be removed.

The current implementation of mayReturn is to check the 'noreturn' function attribute. Are you suggesting to add a new attribute that is called 'mustreturn' ? It is as equally difficult to know if a function will halt.

a function would only be 'mustreturn' [*] if you can prove that it must return

in finite time. By default functions would not be 'mustreturn'. Of course it

is not always possible to prove that a function must return even when it does;

in these cases the function would not be marked 'mustreturn', because you only

mark functions that you can prove must return. This is analogous to 'noreturn'

where you only mark a function 'noreturn' if you can prove it doesn't return;

in difficult cases where it doesn't return but you failed to prove it then the

function is not marked 'noreturn'.

The big difference between the mustreturn and noreturn approaches to the

infinite loop problem is that when the noreturn approach gets it wrong then

you get miscompilations, while when the mustreturn approach gets it wrong you

get missed optimizations, not miscompilations. Summary: the MR approach is

always conservatively correct, while the NR rule is not.

Consider the different cases, comparing what happens with the following two

rules for a function (a function which otherwise wouldn't be considered to have

side effects eg because it is readonly and nounwind):

NR rule: the function is considered to have side-effects iff it has the noreturn attribute;

MR rule: the function is considered to have side-effects iff it does not have the mustreturn attribute.

Case 1: a function infinite loops in an obvious way. The function will be

marked NR, and will not be marked MR. Thus both rules say this function has

side effects.

Case 2: a function which obviously does not infinite loop. The function will

not be marked NR, and will be marked MR. Thus both rules say that this

function does not have side effects.

Case 3: a function which always or sometimes infinite loops, but it is too hard

for the optimizers to work this out. Then the function will not be marked NR

(because we failed to prove it is NR, even though it may be), and will not be

marked MR (because we must fail to prove it is MR, since it is not MR). Then

the NR rule will say that the function *does not have side-effects*, which is

wrong, and may lead to misoptimization. The MR rule says the function *does

have side-effects* which is correct.

Case 4: a function which never infinite loops, but it is too hard for the

optimizers to work this out. Then the function will not be marked NR (because

we must fail to prove it is NR, since it is not NR), and will not be marked MR

(because it was too hard to prove). Then the NR says that the function does not

have side-effects, which is correct. The MR rule says that the function does

have side-effects, which is wrong, but only results in missed optimizations,

*not* miscompilations.

Notice how in every case in which the NR rule says the function has side-effects

then the MR rule also says it has side-effects. Thus if we implement the MR

rule then the NR rule you added is not useful. On the other hand, as these

examples show, the NR rule is inadequate because it doesn't always say you have

side-effects when sometimes you have. By contrast, the MR rule is always

conservatively correct.

Ciao, Duncan.

[*] The difference between 'mustreturn' and my original 'finite' is the question

of what you say for functions containing a call to 'exit' or something like

that. Such a function doesn't return so 'mustreturn' would be confusing, on the

other hand it completes in finite time, thus 'finite' is not unreasonable. To

some extent which version is better is a judgement call, but I think the best

approach may fall out naturally once someone tries to implement a pass that

computes these attributes, as probably one of them is simpler to reason about,

but for the moment I don't know which one it is.