Two labels around one instruction in Codegen

Hi everyone,

In order to have exceptions for non-call instructions (such as sdiv,
load or stores), I'm modifying codegen so that it generates a BeginLabel
and an EndLabel between the "may throwing" instruction. This is what the
codegen of an InvokeInst does.

However, when generating native code, only BeginLabel is generated, and
it is generated after the instruction. I'm not familiar with DAGs in the
codegen library, so here are my 2-cents thoughts why:

1) BeginLabel and EndLabel are generated with:
  DAG.setRoot(DAG.getNode(ISD::LABEL, MVT::Other, getRoot(),
                            DAG.getConstant({Begin|End}Label, MVT::i32)));

This seems to work with InvokeInst instructions, because the root of the
DAG is modified by the instruction. With instructions such as sdiv, the
root is not modified: the instruction only lowers itself to:
DAG.getNode(OpCode, Op1.getValueType(), Op1, Op2)

Which probably makes the codegen think EndLabel and BeginLabel are in
the same place

2) Since there is no ordering between the node for the sdiv instruction
and the labels, the sdiv instruction can be generated anywhere.

These assumptions may be wrong, but it's the best I could come up with ;-).

If someone could correct me and help me found how to correctly generate
two labels between one instruction, that would be great! :slight_smile:

Thanks,
Nicolas

Hi Nicolas,

In order to have exceptions for non-call instructions (such as sdiv,
load or stores), I'm modifying codegen so that it generates a BeginLabel
and an EndLabel between the "may throwing" instruction. This is what the
codegen of an InvokeInst does.

the rule is that all instructions between eh begin labelN and eh end labelN
must unwind to the same landing pad. This is why invokes are bracketed by
such labels. There are also two other cases to consider: (1) potentially
throwing instructions which are not allowed to throw (nounwind), (2) throwing
instructions for which any thrown exception will not be processed in this
function. In case (1) the instruction should have no entry in the final
dwarf exception table, while in case (2) it should have an entry. We don't
handle (1) right now, however the plan is that nounwind calls will also be
bracketed by labels but will have no associated landing pad. As for (2),
the dwarf writer scans all instructions in the function and if it sees a
call that is not bracketed by labels then it generates an appropriate entry
in the exception table (this will of course need to be modified to consider
all throwing instructions - note that this means that "maythrow" markings will
have to exist right to the end of code generation!); it is done this way
because labels inhibit optimizations (we used to bracket all calls with
labels, but stopped doing that because of the optimization problem). I'm
mentioning this because the begin and end labels are not *between* maythrow
instructions, they bracket them.

However, when generating native code, only BeginLabel is generated, and
it is generated after the instruction. I'm not familiar with DAGs in the
codegen library, so here are my 2-cents thoughts why:

1) BeginLabel and EndLabel are generated with:
  DAG.setRoot(DAG.getNode(ISD::LABEL, MVT::Other, getRoot(),
                            DAG.getConstant({Begin|End}Label, MVT::i32)));

This seems to work with InvokeInst instructions, because the root of the
DAG is modified by the instruction. With instructions such as sdiv, the
root is not modified: the instruction only lowers itself to:
DAG.getNode(OpCode, Op1.getValueType(), Op1, Op2)

I think that not creating a new root means that the instruction is allowed
to be re-ordered with respect to other instructions, as long as it occurs
before its uses. Re-ordering is rather dubious for instructions that may
throw, though it's not clear what is acceptable. I think you probably need
a new selection DAG "throw" node which you wrap throwing instructions in, a
bit like a TokenFactor. This throw node would be setup in such a way as to
be bracketable by labels.

Which probably makes the codegen think EndLabel and BeginLabel are in
the same place

In that case I would expect them both to be deleted...

2) Since there is no ordering between the node for the sdiv instruction
and the labels, the sdiv instruction can be generated anywhere.

These assumptions may be wrong, but it's the best I could come up with ;-).

If someone could correct me and help me found how to correctly generate
two labels between one instruction, that would be great! :slight_smile:

Ciao,

Duncan.

Duncan Sands wrote:

Hi Nicolas,

In order to have exceptions for non-call instructions (such as sdiv,
load or stores), I'm modifying codegen so that it generates a BeginLabel
and an EndLabel between the "may throwing" instruction. This is what the
codegen of an InvokeInst does.
    
the rule is that all instructions between eh begin labelN and eh end labelN
must unwind to the same landing pad. This is why invokes are bracketed by
such labels. There are also two other cases to consider: (1) potentially
throwing instructions which are not allowed to throw (nounwind),

What do you mean "not allowed"? Is this decided by the front-end? Or by
an optimization pass (div may throw, but if we have a = b / 5 we not it
won't throw).

(2) throwing
instructions for which any thrown exception will not be processed in this
function.

I'm not sure I understand here.

In case (1) the instruction should have no entry in the final
dwarf exception table, while in case (2) it should have an entry. We don't
handle (1) right now, however the plan is that nounwind calls will also be
bracketed by labels but will have no associated landing pad.

Why would they be bracketed by labels if codegen knows they don't throw?

As for (2),
the dwarf writer scans all instructions in the function and if it sees a
call that is not bracketed by labels then it generates an appropriate entry
in the exception table

Do you mean "that _is_ bracketed by labels" ?

(this will of course need to be modified to consider
all throwing instructions - note that this means that "maythrow" markings will
have to exist right to the end of code generation!); it is done this way
because labels inhibit optimizations (we used to bracket all calls with
labels, but stopped doing that because of the optimization problem). I'm
mentioning this because the begin and end labels are not *between* maythrow
instructions, they bracket them.

Sure, that would be the goal. Which means the labels are not created
between an instruction, but between the instructions of a basic block.
I'll see if this works. My first implementation was between one
instruction because it was very simple to copy the invoke case for
non-calls.

However, when generating native code, only BeginLabel is generated, and
it is generated after the instruction. I'm not familiar with DAGs in the
codegen library, so here are my 2-cents thoughts why:

1) BeginLabel and EndLabel are generated with:
  DAG.setRoot(DAG.getNode(ISD::LABEL, MVT::Other, getRoot(),
                            DAG.getConstant({Begin|End}Label, MVT::i32)));

This seems to work with InvokeInst instructions, because the root of the
DAG is modified by the instruction. With instructions such as sdiv, the
root is not modified: the instruction only lowers itself to:
DAG.getNode(OpCode, Op1.getValueType(), Op1, Op2)
    
I think that not creating a new root means that the instruction is allowed
to be re-ordered with respect to other instructions, as long as it occurs
before its uses. Re-ordering is rather dubious for instructions that may
throw, though it's not clear what is acceptable. I think you probably need
a new selection DAG "throw" node which you wrap throwing instructions in, a
bit like a TokenFactor. This throw node would be setup in such a way as to
be bracketable by labels.

I need to get some LLVM code reading :wink:

Which probably makes the codegen think EndLabel and BeginLabel are in
the same place
    
In that case I would expect them both to be deleted...
  
Only one was deleted. Consider the code:

define i32 @test(i32 %argc) {
entry:
        %tmp2 = sdiv i32 2, %argc to label %continue unwind to
label %unwindblock ; <i32> [#uses=1]

continue:
        ret i32 %tmp2

unwindblock:
        unwind
}

And here is the resulting x86 code (Llabel1 was supposed to be before
the {ctld, idvl} and Llabel2 which was after is not generated)

test:
.Leh_func_begin1:
          
.Llabel4:
        movl $2, %eax
        movl 4(%esp), %ecx
        cltd
        idivl %ecx
          
.Llabel1:
.LBB1_1: # continue
        ret
.LBB1_2: # unwindblock

Thanks Duncan,
Nicolas

Hi everyone,

In order to have exceptions for non-call instructions (such as sdiv,
load or stores), I'm modifying codegen so that it generates a BeginLabel
and an EndLabel between the "may throwing" instruction. This is what the
codegen of an InvokeInst does.

However, when generating native code, only BeginLabel is generated, and
it is generated after the instruction. I'm not familiar with DAGs in the
codegen library, so here are my 2-cents thoughts why:

1) BeginLabel and EndLabel are generated with:
  DAG.setRoot(DAG.getNode(ISD::LABEL, MVT::Other, getRoot(),
                            DAG.getConstant({Begin|End}Label, MVT::i32)));

This seems to work with InvokeInst instructions, because the root of the
DAG is modified by the instruction. With instructions such as sdiv, the
root is not modified: the instruction only lowers itself to:
DAG.getNode(OpCode, Op1.getValueType(), Op1, Op2)

DIV does not produce a chain. So the second LABEL's only operand is the first LABEL. The only ordering that's ensure are between the 2 labels so it's very possible the DIV node will be scheduled after both labels. Actually this scheme doesn't even ensure there won't be anything scheduled between the first label and the DIV node. I am assuming that'd badness.

Which probably makes the codegen think EndLabel and BeginLabel are in
the same place

2) Since there is no ordering between the node for the sdiv instruction
and the labels, the sdiv instruction can be generated anywhere.

Right.

These assumptions may be wrong, but it's the best I could come up with ;-).

If someone could correct me and help me found how to correctly generate
two labels between one instruction, that would be great! :slight_smile:

One way to solve this right now is to use flag value. But that means ISD::LABEL, ISD::{S|U}DIV, ISD::LOAD, ISD::STORE will be marked SDNPOutFlag and SDNPOptInFlag. But that's just yucky. Perhaps we need to add new variants of these nodes and leave the current opcodes as non-faulting. But I am not certain that's a very clean solution either.

Evan

Hi Nicolas,

>> In order to have exceptions for non-call instructions (such as sdiv,
>> load or stores), I'm modifying codegen so that it generates a BeginLabel
>> and an EndLabel between the "may throwing" instruction. This is what the
>> codegen of an InvokeInst does.
>>
>
> the rule is that all instructions between eh begin labelN and eh end labelN
> must unwind to the same landing pad. This is why invokes are bracketed by
> such labels. There are also two other cases to consider: (1) potentially
> throwing instructions which are not allowed to throw (nounwind),

What do you mean "not allowed"? Is this decided by the front-end?

yes, it is decided by the front-end. In C++ some constructs are not allowed
to throw (eg: a destructor that is run while unwinding some other exception)
and must result in a call to terminate. Sometimes this can be implemented
by simply wrapping the construct in a try-catch block that calls terminate
if any exception is thrown. But there are obscure situations in which this
can't be done (I forget why - I can look into it again if you like), in
which case the C++ runtime unwinder takes care of it. However for the
unwinder to do it properly, there need to be special markings in the unwind
tables, saying "this call is not allowed to throw".

Or by
an optimization pass (div may throw, but if we have a = b / 5 we not it
won't throw).

No, that's a different issue.

> (2) throwing
> instructions for which any thrown exception will not be processed in this
> function.

I'm not sure I understand here.

For example:
int f(int n) { return 1/n; }
Here the instruction may throw an exception. But there is no handler for it
in function f. However there may be a handler further up the call stack.

> In case (1) the instruction should have no entry in the final
> dwarf exception table, while in case (2) it should have an entry. We don't
> handle (1) right now, however the plan is that nounwind calls will also be
> bracketed by labels but will have no associated landing pad.

Why would they be bracketed by labels if codegen knows they don't throw?

No, this is when they may throw but they're not allowed to according to the
language semantics, i.e. if they do throw the language runtime wants to know
about it and take special action. The labels and special entry in the exception
table exist to tell the runtime that special action should be taken if an
exception is propagated by an instruction between the labels.

> As for (2),
> the dwarf writer scans all instructions in the function and if it sees a
> call that is not bracketed by labels then it generates an appropriate entry
> in the exception table

Do you mean "that _is_ bracketed by labels" ?

No, that is *not* bracketed by labels. It is a strange feature of C++ exception
handling that any call that has no entry in the exception table is considered to
not be allowed to throw exceptions (see (1) above), and if it does throw an
exception then the runtime will call terminate. That means that all ordinary
calls need to have an entry in the exception table. The result is (since we don't
handle (1) yet) that *all* calls end up with entries in the exception table.
However, in order to avoid putting gazillions of labels everywhere (i.e. around
every call), we only out labels around invokes. If the dwarf writer sees a call
bracketed by labels it understands that this is an invoke; if it sees a call that
is not bracketed by labels it understands that this is an ordinary call, and
generates an appropriate entry in the exception table.

> (this will of course need to be modified to consider
> all throwing instructions - note that this means that "maythrow" markings will
> have to exist right to the end of code generation!); it is done this way
> because labels inhibit optimizations (we used to bracket all calls with
> labels, but stopped doing that because of the optimization problem). I'm
> mentioning this because the begin and end labels are not *between* maythrow
> instructions, they bracket them.
>
>

Sure, that would be the goal. Which means the labels are not created
between an instruction, but between the instructions of a basic block.
I'll see if this works. My first implementation was between one
instruction because it was very simple to copy the invoke case for
non-calls.

If all instructions in a basic block unwind to the same place then it is
indeed enough to put a label at the beginning and end of the block.

>> However, when generating native code, only BeginLabel is generated, and
>> it is generated after the instruction. I'm not familiar with DAGs in the
>> codegen library, so here are my 2-cents thoughts why:
>>
>> 1) BeginLabel and EndLabel are generated with:
>> DAG.setRoot(DAG.getNode(ISD::LABEL, MVT::Other, getRoot(),
>> DAG.getConstant({Begin|End}Label, MVT::i32)));
>>
>> This seems to work with InvokeInst instructions, because the root of the
>> DAG is modified by the instruction. With instructions such as sdiv, the
>> root is not modified: the instruction only lowers itself to:
>> DAG.getNode(OpCode, Op1.getValueType(), Op1, Op2)
>>
>
> I think that not creating a new root means that the instruction is allowed
> to be re-ordered with respect to other instructions, as long as it occurs
> before its uses. Re-ordering is rather dubious for instructions that may
> throw, though it's not clear what is acceptable. I think you probably need
> a new selection DAG "throw" node which you wrap throwing instructions in, a
> bit like a TokenFactor. This throw node would be setup in such a way as to
> be bracketable by labels.
>
>

I need to get some LLVM code reading :wink:

>> Which probably makes the codegen think EndLabel and BeginLabel are in
>> the same place
>>
>
> In that case I would expect them both to be deleted...
>

Only one was deleted. Consider the code:

define i32 @test(i32 %argc) {
entry:
        %tmp2 = sdiv i32 2, %argc to label %continue unwind to
label %unwindblock ; <i32> [#uses=1]

continue:
        ret i32 %tmp2

unwindblock:
        unwind
}

And here is the resulting x86 code (Llabel1 was supposed to be before
the {ctld, idvl} and Llabel2 which was after is not generated)

test:
.Leh_func_begin1:
          
.Llabel4:
        movl $2, %eax
        movl 4(%esp), %ecx
        cltd
        idivl %ecx
          
.Llabel1:
.LBB1_1: # continue
        ret
.LBB1_2: # unwindblock

OK, I may take a look if I can find time (hah!).

Ciao,

Duncan.

Hi Evan,

Evan Cheng wrote:

One way to solve this right now is to use flag value. But that means
ISD::LABEL, ISD::{S|U}DIV, ISD::LOAD, ISD::STORE will be marked
SDNPOutFlag and SDNPOptInFlag. But that's just yucky. Perhaps we need
to add new variants of these nodes and leave the current opcodes as
non-faulting. But I am not certain that's a very clean solution either.

I think having variants (1) or differentiating ISD::{S|U}DIV from other
binary instructions (2) is what we would like to avoid.

Following what we discussed with Duncan, what if we generate the labels
around a basic block? Is there a way then to ensure that the begin and
end labels will actually bracket the instructions in the block? I've
found that currently it's not the case, but perhaps it can trigger an
easier solution than (1) or (2).

Thanks,
Nicolas

Hi Evan,

Evan Cheng wrote:

One way to solve this right now is to use flag value. But that means
ISD::LABEL, ISD::{S|U}DIV, ISD::LOAD, ISD::STORE will be marked
SDNPOutFlag and SDNPOptInFlag. But that's just yucky. Perhaps we need
to add new variants of these nodes and leave the current opcodes as
non-faulting. But I am not certain that's a very clean solution either.

I think having variants (1) or differentiating ISD::{S|U}DIV from other
binary instructions (2) is what we would like to avoid.

Following what we discussed with Duncan, what if we generate the labels
around a basic block? Is there a way then to ensure that the begin and
end labels will actually bracket the instructions in the block? I've
found that currently it's not the case, but perhaps it can trigger an
easier solution than (1) or (2).

Ok, so it turns out the labels do not have to be just before / after the divide. So we don't have to use the MVT::Flag hackery. However, the second label must be after the divide. I think the solution is to add a trapping version of DIV (and others) and the second label can use its chain value as operand.

Evan

Ok, so it turns out the labels do not have to be just before / after
the divide. So we don't have to use the MVT::Flag hackery. However,
the second label must be after the divide. I think the solution is to
add a trapping version of DIV (and others) and the second label can
use its chain value as operand.

How about a new "trapping" SDNode, which you can wrap trapping sdiv
and other instructions in?

D.

That would complicate instruction selection. Not worth it.

Evan

Hi all

i read the thread on register allocation extensions for better balancing of the
binding of variables to registers, especially regarding their concurrent
availability through a (probably large) number of read ports. A similar problem
would apply for commiting a large number of results (write operands) to the
multi-port register file.

I'm also interested in this problem since i'm developing (actually: i have
developed) the hardware for use in FPGA devices a processor core with similar
capabilities (regarding the multi-input, multi-output requirement) without
adhering to the TTA architectural principles. But it certainly departs from
typical designs (of RISC/CISC hybrids) in a number of ways: for example, there
is no centralized control unit.

The closest work to this problem is probably an PLDI'04 paper, as well as, works
on variable binding for High-Level Synthesis. Here is the reference to the PLDI
paper:

Balancing Register Allocation Across Threads for a Multithreaded Network
Processor
Xiaotong Zhuang and Santosh Pande
PLDI'04

Another take (and quite not a longshot although probably Fernando -- thanks for
the debugger, will try it -- has worked either alone or with NVK on the ILP
formulation) is to write-up constraints for an ILP (integer linear programming)
solution. I wouldn't mind to wait forever (for several minutes that is ^_^) for
a valid register allocation to complete. My application programs range between
50 (the smallest) to about 2K instructions. Basic blocks would be anything from
a few instructions to a few hundreds of instructions.

There exists at least one compiler that works on an ILP and/or dynamic
programming solution for all three problems of backend compilation. It is named
OPTIMIST and lives on a Swedish univ. server. It has been GPL'ed but i don't
know of anyone working on it (besides the developers themselves).

Overall, I think processor developers should give serious thoughts on adopting
LLVM for developing their compilers (however usually not for JIT compilation).
It's probably the only solution that can combine reasonable time frames (1-6
man months) for a basic non-JIT compiler backend, ease of retargeting for
incremental additions along with all the good frontend and infrastructure
stuff.

Kind regards
Nikolaos Kavvadias

Another take (and quite not a longshot although probably Fernando -- thanks for
the debugger, will try it -- has worked either alone or with NVK on the ILP
formulation) is to write-up constraints for an ILP (integer linear programming)
solution. I wouldn't mind to wait forever (for several minutes that is ^_^) for
a valid register allocation to complete. My application programs range between
50 (the smallest) to about 2K instructions. Basic blocks would be anything from
a few instructions to a few hundreds of instructions.

Hi, actually, I have not done anything towards ILP for register allocation, but two of my friends have. Actually, there is a branch-and-bound solver for LLVM that produces very nice results. It was coded by Lang Hames, and I am using it in some tests. If you need the algorithm, I guess Lang can give it to you.

best,

Fernando

Hi Fernando

> Another take (and quite not a longshot although probably Fernando -- thanks
for
> the debugger, will try it -- has worked either alone or with NVK on the ILP
> formulation) is to write-up constraints for an ILP (integer linear
programming)
> solution. I wouldn't mind to wait forever (for several minutes that is ^_^)
for
> a valid register allocation to complete. My application programs range
between
> 50 (the smallest) to about 2K instructions. Basic blocks would be anything
from
> a few instructions to a few hundreds of instructions.

Hi, actually, I have not done anything towards ILP for register
allocation, but two of my friends have. Actually, there is a
branch-and-bound solver for LLVM that produces very nice results. It was
coded by Lang Hames, and I am using it in some tests. If you need the
algorithm, I guess Lang can give it to you.

best,

I'm very glad to hear about the branch-and-bound solver that is in use in your
team. If possible, I would like to try it.

My soft core processor can be configured for up to 256 registers, so for such a
large number of registers, certain things come in mind (and do quite affect
performance) like register rotation etc. But first of all i would like to see
if a good balancing can be obtained with longer allocation times, so the
b-and-b solver sounds good.

Again, thanks for your offer. BTW my own tools (it is a diverse sets of thingys)
can be found here (feel free to download any):

http://electronics.physics.auth.gr/tomeas/en/kavvadias.html

Kind regards
Nikolaos Kavvadias