How to decide whether a function is executed or not

Hello everyone,

I want to decide whether a function is executed or not. For example (the value of
cond is not determined at compile time):

br i1 %cond, label %if, label %else

if:

call void f()

br label %exit

else:

br label %exit

We could say that function f is control dependent on cond and may not be
executed.

On the other hand:

br i1 %cond, label %if, label %else

if:

call void f()

br label %exit

else:

call void f()

br label %exit

No matter the value of cond is, function f would be executed.

I am wondering whether there exist any algorithm to decide whether function f is
executed or not. Any suggestion or key word are appreciated.

Many thanks

Hello,

I should think that it is undecidable: it looks like a variant of the Halting Problem.

Consider a function which contains an infinite loop: any algorithm which could determine whether that function is called or not would effectively be an algorithm that could determine whether the program containing that function halts or not. Equally, deciding whether the function contains an infinite loop or not is itself an instance of the Halting Problem.

Richard

A more conservative approach, bailing out on any loops or
complication, would scan the BB flow graph in between any two nodes
(between fully dominating / fully dominated) would return { no, maybe,
yes }. I'd guess that it'd return maybe much more often that yes/no
for the interesting cases, limiting the possibilities of this
approach.

Do you have a concrete use in mind? DCE for the "no" cases? Hoisting
for the "yes" cases? Wouldn't you have to track the arguments as well?
That can quickly get out of hand...

cheers,
--renato

(sorry for the long post; hopefully you will find it informative)

There exist microcontrollers with 16 or 32 bytes of registers, and no RAM.
It's enough to run your washing machine, or microwave oven.

This method would be completely impractical even for such a
microcontroller. Let alone anything with RAM. Or I/O.

This upper bound can be established for real computers, or any model of
computation where there is a (computable) limit on the amount of storage
space. A real computer has a finite amount of state (say N bits) in RAM,
registers, etc.; furthermore, each "clock cycle" it deterministically
transitions from one particular configuration of those N bits to another
configuration (ignoring I/O, "random seed" instructions, etc.); there are
only 2^N such configurations. So if it runs for 2^N + 1 cycles, at least
one configuration must have been entered twice. Because the computer is
deterministic, the same path that took it from the repeating configuration
back to itself will now take it back to that configuration again, and
again, and again, so the program is stuck in an infinite loop.

There exist microcontrollers with 16 or 32 bytes of registers, and no RAM.
It's enough to run your washing machine, or microwave oven.

Yup. Tiny ones like that are usually mask-programmed or one-time
programmable, so unfortunately there's no use desoldering them from broken
items and keeping them around to reuse in projects. Just have to throw them
away :frowning:

This method would be completely impractical even for such a
microcontroller. Let alone anything with RAM. Or I/O.

Yes, that's what I said in the next paragraph. The importance of this
observation is not the practicality: it's that the nature of the difficulty
changes radically, from "provably impossible to do, AT ALL" to "we know at
least that there is an obvious extremely slow and naive algorithm for the
general case, but the door is open for improvements".

Just to play devil's advocate against your claim "would be completely
impractical even for such a microcontroller", I'll give you another
situation where you theoretically have to deal with 2^N configurations
(states), but on many kinds of typical practical inputs the result is quite
manageable: the power set construction, where an NFA with N states is
turned into a DFA with (worst case) 2^N states.

-- Sean Silva

Hello all (sorry for the long post),

Sorry for the misleading of the previous mail. In fact what I want to implement is a uniformness checking pass of OpenCL barrier. Here is the description of barrier function:

All work-items in a work-group executing the kernel on a processor must execute this function before any are allowed to continue execution beyond the barrier. This function must be encountered by all work-items in a work-group executing the kernel.
If barrier is inside a conditional statement, then all work-items must enter the conditional if any work-item enters the conditional statement and executes the barrier.
If barrier is inside a loop, all work-items must execute the barrier for each iteration of the loop before any are allowed to continue execution beyond the barrier.

So if we have the following SPIR:

entry:
%tid = tail call i32 @get_global_id(i32 0)
%cmp = icmp sgt i32 %tid, 25
br i1 %cmp, label %if, label %exit

if:

tail call void @barrier(i32 3)

br label %exit

exit:
ret void

The barrier call is thread dependent and violate uniformness. However if we have another SPIR:

entry:
%tid = tail call i32 @get_global_id(i32 0)
%cmp = icmp sgt i32 %tid, 25
br i1 %cmp, label %if, label %else

if:

tail call void @barrier(i32 3)

br label %exit

else:

tail call void @barrier(i32 3)

br label %exit

exit:
ret void

The barrier function would be executed regardless of the value returned by get_global_id so it is thread independent.

I have implement a LLVM pass to detect control dependency between “tail call void @barrier(i32 3)” and “%tid = tail call i32 @get_global_id(i32 0)”. But the pass is unuseful for the second example.

Any suggestion and keyword is welcomed and appreciated :slight_smile:

You might want to look at this paper:
http://www.doc.ic.ac.uk/~afd/homepages/papers/pdfs/2013/ESOP.pdf

It defines a notion of barrier divergence and lays out a scheme for showing
that a program is free from it. There is also an implementation that uses
a theorem prover to show barrier divergence freedom (among other things).

Peter

A little OT nitpick: Maybe for microwave ovens it’s OK, but for clotheswashing machines I’d say it’d be a no-go with 32 bytes of memory. Modern washing machines use brushless, sensorless motors where you need to run a state estimator, a vector generator, and a PID or similar controller. Filters in those often need to keep quite a bit of history around, and I’d be very happy to hear about someone pulling it all off in less than 384 bytes of data RAM for stack and variables.

Cheers, Kuba