Motivation for 'select' instruction

Hello,
I'm wondering what was the original motivaton for the 'select' instruction.
Was it for convenience, or for some deep reason. I'm asking because it's
causing me some problems (see below) and I'd like to know I understand the
situation before working those problems around.

I have the following function:

    int logsch(int ih,int nbh)
    {
         if(nbh < 0) { nbh = 0; ih = 10; }
         if(nbh > 22528) { nbh = 22528; ih = 7; }
         return(nbh + ih);
    }

From the C language standpoint, there are 4 paths though this function: each
"if" can be either take or not. In reality, both "if" can't be taked at the
same time -- and that's why I'm trying to automatically figure out working on
LLVM representation. However, the LLVM code:

    int %logsch(int %ih, int %nbh) {
    entry:
             %tmp.1 = setlt int %nbh, 0
             %ih_addr.1 = select bool %tmp.1, int 10, int %ih
             %nbh_addr.1 = select bool %tmp.1, int 0, int %nbh
             %tmp.4 = setgt int %nbh_addr.1, 22528
             %ih_addr.0 = select bool %tmp.4, int 7, int %ih_addr.1
             %nbh_addr.0 = select bool %tmp.4, int 22528, int %nbh_addr.1
             %tmp.8 = add int %nbh_addr.0, %ih_addr.0
             ret int %tmp.8
    }

Has 'select' instructions and a single path, which makes my task harder.

- Volodya

I'm wondering what was the original motivaton for the 'select'
instruction. Was it for convenience, or for some deep reason. I'm
asking because it's causing me some problems (see below) and I'd like to
know I understand the situation before working those problems around.

The select instruction is basically an SSA-form "conditional move", or a
very simple form of predication. Integer codes often have very complex
CFG's which are usually doing simple things. For example, it's not
uncommon to see something like this:

if (blah)
  X = 14;

If the body of the if statement is simple enough, it's usually cheaper to
unconditionally execute the body than it is to execute the branch. If
this is the case, we like to use the select when possible. Once the
control flow is turned into data flow, the optimizers are also able to do
good things eliminating it.

I have the following function:

    int logsch(int ih,int nbh)
    {
         if(nbh < 0) { nbh = 0; ih = 10; }
         if(nbh > 22528) { nbh = 22528; ih = 7; }
         return(nbh + ih);
    }

>From the C language standpoint, there are 4 paths though this function: each
"if" can be either take or not. In reality, both "if" can't be taked at the
same time -- and that's why I'm trying to automatically figure out working on
LLVM representation. However, the LLVM code:

    int %logsch(int %ih, int %nbh) {
    entry:
             %tmp.1 = setlt int %nbh, 0
             %ih_addr.1 = select bool %tmp.1, int 10, int %ih
             %nbh_addr.1 = select bool %tmp.1, int 0, int %nbh
             %tmp.4 = setgt int %nbh_addr.1, 22528
             %ih_addr.0 = select bool %tmp.4, int 7, int %ih_addr.1
             %nbh_addr.0 = select bool %tmp.4, int 22528, int %nbh_addr.1
             %tmp.8 = add int %nbh_addr.0, %ih_addr.0
             ret int %tmp.8
    }

Has 'select' instructions and a single path, which makes my task harder.

What are you trying to do? It looks like there is some correlation in the
branches that could be optimized away, is that what you're going after?
In the very worst case, you can always use the SelectLowering pass to turn
the select instructions into branches, but it would be preferable to find
another way if possible. :slight_smile:

Note that if you put more code (such as a function call) into the body of
the if statements that the selects will go away. I don't know if this is
an option for you.

-Chris

Chris Lattner wrote:

The select instruction is basically an SSA-form "conditional move", or a
very simple form of predication. Integer codes often have very complex
CFG's which are usually doing simple things. For example, it's not
uncommon to see something like this:

if (blah)
  X = 14;

If the body of the if statement is simple enough, it's usually cheaper to
unconditionally execute the body than it is to execute the branch.

Yes, I understand that. Though I though such optimization is usually done
somewhere in backend, that's why I was surprised that it's commonly used in
LLVM.

> "if" can be either take or not. In reality, both "if" can't be taked at
> the same time -- and that's why I'm trying to automatically figure out
> working on LLVM representation. However, the LLVM code:
>
> int %logsch(int %ih, int %nbh) {
> entry:
> %tmp.1 = setlt int %nbh, 0
> %ih_addr.1 = select bool %tmp.1, int 10, int %ih
> %nbh_addr.1 = select bool %tmp.1, int 0, int %nbh
> %tmp.4 = setgt int %nbh_addr.1, 22528
> %ih_addr.0 = select bool %tmp.4, int 7, int %ih_addr.1
> %nbh_addr.0 = select bool %tmp.4, int 22528, int %nbh_addr.1
> %tmp.8 = add int %nbh_addr.0, %ih_addr.0
> ret int %tmp.8
> }
>
> Has 'select' instructions and a single path, which makes my task harder.

What are you trying to do? It looks like there is some correlation in the
branches that could be optimized away, is that what you're going after?

No, not exactly. I need to find the path with the maximum execution time (in
assembler for specific processor). Do to so, I must also find paths which
cannot be executed no matter what input is (like one of the paths in the C
example). I'm trying to do that using LLVM intermediate representation.

In the very worst case, you can always use the SelectLowering pass to turn
the select instructions into branches, but it would be preferable to find
another way if possible. :slight_smile:

In fact, I think this might be enough for me, since for now I'm only going to
evaluate the algorithms for detecting infeasible paths.

Thanks,
Volodya

> > int %logsch(int %ih, int %nbh) {
> > entry:
> > %tmp.1 = setlt int %nbh, 0
> > %ih_addr.1 = select bool %tmp.1, int 10, int %ih
> > %nbh_addr.1 = select bool %tmp.1, int 0, int %nbh
> > %tmp.4 = setgt int %nbh_addr.1, 22528
> > %ih_addr.0 = select bool %tmp.4, int 7, int %ih_addr.1
> > %nbh_addr.0 = select bool %tmp.4, int 22528, int %nbh_addr.1
> > %tmp.8 = add int %nbh_addr.0, %ih_addr.0
> > ret int %tmp.8
> > }
> >
> > Has 'select' instructions and a single path, which makes my task harder.
>
> What are you trying to do? It looks like there is some correlation in the
> branches that could be optimized away, is that what you're going after?

No, not exactly. I need to find the path with the maximum execution time (in
assembler for specific processor). Do to so, I must also find paths which
cannot be executed no matter what input is (like one of the paths in the C
example). I'm trying to do that using LLVM intermediate representation.

In that case, the above is a simple straight-line sequence of code, which
has exactly one path through it. I don't think you should need to break
it up into multiple paths. If the control flow has been flattened like
this, does it matter if some combination of predicates is impossible?

> In the very worst case, you can always use the SelectLowering pass to turn
> the select instructions into branches, but it would be preferable to find
> another way if possible. :slight_smile:

In fact, I think this might be enough for me, since for now I'm only going to
evaluate the algorithms for detecting infeasible paths.

Okay, that works. :slight_smile:

-Chris

Chris Lattner wrote:

> No, not exactly. I need to find the path with the maximum execution time
> (in assembler for specific processor). Do to so, I must also find paths
> which cannot be executed no matter what input is (like one of the paths
> in the C example). I'm trying to do that using LLVM intermediate
> representation.

In that case, the above is a simple straight-line sequence of code, which
has exactly one path through it. I don't think you should need to break
it up into multiple paths. If the control flow has been flattened like
this, does it matter if some combination of predicates is impossible?

Yes, because the target processor does not have any equivivalent of the
'select' statement. Generally speaking, I have to search for paths in
assembler for that processor and, devise a way to map them back into LLVM and
run analysis then, but that's more work (and research). For a start, lowering
selects will do.

- Volodya

Yes, because the target processor does not have any equivivalent of the
'select' statement. Generally speaking, I have to search for paths in
assembler for that processor and, devise a way to map them back into LLVM and
run analysis then, but that's more work (and research). For a start, lowering
selects will do.

Okay, that makes sense.

-Chris