'Address of Label and Indirect Branches in LLVM IR' blog post

If you're interested in this new extension, here is some more information, including some less-than-obvious aspects of the design:
Address of Label and Indirect Branches in LLVM IR - The LLVM Project Blog

Bob Wilson, Dan Gohman and I added this feature to LLVM mainline back in November. If you have questions or comments about the post, this is a good thread to discuss them in :)

-Chris

Can a label be listed multiple times in indirectbr?

Clang generates this:
foo:            ; preds = %indirectgoto, %indirectgoto, %indirectgoto, %indirectgoto, %indirectgoto
  store i32 1, i32* %retval
  br label %return

indirectbr i8* %indirect.goto.dest, [label %foo, label %foo, label %bar, label %foo, label %hack, label %foo, label %foo]

For this code taken from the gcc manual:
....
     static const int array[] = { &&foo - &&foo, &&bar - &&foo,
                                  &&hack - &&foo };
     goto *(&&foo + array[i]);
.....

If I remove &&foo - &&foo from the array, clang still thinks that &&foo
is a possible destination, even if I run some optimizers on it.
Since the argument to goto is an array of constants, it should be
possible for an optimizer to determine the exact list of destinations.
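
For reference, the GCC-manual fragment quoted above can be fleshed out into a complete function. This is only a sketch: the function name `dispatch`, the bounds check, and the return values are assumptions added for illustration, and it relies on the GNU labels-as-values extension (accepted by both gcc and clang).

```c
#include <assert.h>

/* Hypothetical wrapper around the GCC-manual fragment; the name
   "dispatch" and the return values are made up for illustration. */
int dispatch(int i)
{
    /* Each entry is an offset from &&foo, not an absolute address. */
    static const int array[] = { &&foo - &&foo, &&bar - &&foo,
                                 &&hack - &&foo };

    /* Guard against indexing outside the three-entry table. */
    assert(i >= 0 && i < 3);

    /* Computed goto: base label plus the selected offset. */
    goto *(&&foo + array[i]);

foo:
    return 1;
bar:
    return 2;
hack:
    return 3;
}
```

Calling `dispatch(0)`, `dispatch(1)` and `dispatch(2)` reaches `foo`, `bar` and `hack` respectively. Note that `&&foo` is written five times in the fragment (four times in the table, once in the `goto`), which matches the five `label %foo` entries in the indirectbr listing above, in the same order. That correspondence is an inference from the listing, not something stated in the thread, but it would explain why dropping only the `&&foo - &&foo` entry still leaves `%foo` as a possible destination: the remaining expressions still take its address.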

Also the intent of that code is to allow it to go into a readonly
section, however with Clang it only goes to a .data.rel.ro section (with
-fPIC):
        .section .data.rel.ro,"aw",@progbits
        .align 4
foo.array:
        .long (.LBA3_foo_return) - (.LBA3_foo_return)
        .long (.LBA3_foo_bar) - (.LBA3_foo_return)
        .long (.LBA3_foo_hack) - (.LBA3_foo_return)
        .size foo.array, 12

While gcc does put it into a readonly section (with -fPIC):
        .section .rodata
        .align 4
        .type array.1248, @object
        .size array.1248, 12
array.1248:
        .long 0
        .long .L4-.L2
        .long .L5-.L2
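
For contrast, the more common way to write such a table stores the label addresses themselves. The sketch below assumes the same three-label shape as the GCC-manual fragment; the name `dispatch_abs` and the return values are made up. Because each entry here is an absolute address, position-independent code generally needs a load-time relocation per entry, so a table like this cannot normally live in a plain read-only section. The offset form quoted earlier holds link-time constants, which is what makes read-only placement possible.

```c
#include <assert.h>

/* Hypothetical variant that stores absolute label addresses. */
int dispatch_abs(int i)
{
    /* Each entry is a full address; with -fPIC these usually need
       dynamic relocations when the object is loaded. */
    static void *const table[] = { &&foo, &&bar, &&hack };

    /* Guard against indexing outside the three-entry table. */
    assert(i >= 0 && i < 3);

    /* Computed goto straight through the stored address. */
    goto *table[i];

foo:
    return 1;
bar:
    return 2;
hack:
    return 3;
}
```

Both forms dispatch identically at run time; the difference is only in what the table holds and therefore which section it can be placed in.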

Best regards,
--Edwin

Hello, Edwin

Also the intent of that code is to allow it to go into a readonly
section, however with Clang it only goes to a .data.rel.ro section (with
-fPIC):

Sounds like a bug. File a PR and assign it to me. I will look into it
when I return from vacation.

Can a label be listed multiple times in indirectbr?

Yes; it's not particularly meaningful, but it's not difficult to
construct a case where the optimizer will introduce such a construct.

If I remove &&foo - &&foo from the array, clang still thinks that &&foo
is a possible destination, even if I run some optimizers on it.
Since the argument to goto is an array of constants, it should be
possible for an optimizer to determine the exact list of destinations.

Missed optimization, I guess... put it into lib/Target/README.txt if
you think it's an interesting case to try to optimize. (It doesn't
strike me as particularly interesting because anyone using indirect
gotos is going to be coding carefully anyway.)

Also the intent of that code is to allow it to go into a readonly
section, however with Clang it only goes to a .data.rel.ro section (with
-fPIC):

Another missed optimization; this one seems pretty important, though.

-Eli

  

Can a label be listed multiple times in indirectbr?
    
Yes; it's not particularly meaningful, but it's not difficult to
construct a case where the optimizer will introduce such a construct.
  
Ok.

If I remove &&foo - &&foo from the array, clang still thinks that &&foo
is a possible destination, even if I run some optimizers on it.
Since the argument to goto is an array of constants, it should be
possible for an optimizer to determine the exact list of destinations.
    
Missed optimization, I guess... put it into lib/Target/README.txt if
you think it's an interesting case to try to optimize. (It doesn't
strike me as particularly interesting because anyone using indirect
gotos is going to be coding carefully anyway.)
  
If the code generator isn't confused by the multiple destinations, then
it's fine.
I can't think of a situation where the presence or absence of that one
extra edge would matter.

Also the intent of that code is to allow it to go into a readonly
section, however with Clang it only goes to a .data.rel.ro section (with
-fPIC):
    
Another missed optimization; this one seems pretty important, though.
  
Hello, Edwin

Also the intent of that code is to allow it to go into a readonly
section, however with Clang it only goes to a .data.rel.ro section (with
-fPIC):
    

Sounds like a bug. File a PR and assign it to me. I will look into it
when I return from vacation.

Done, PR5929.

Best regards,
--Edwin

Can a label be listed multiple times in indirectbr?

Yep.

Also the intent of that code is to allow it to go into a readonly
section, however with Clang it only goes to a .data.rel.ro section (with
-fPIC):

Nice catch, fixed in r92450!

-Chris

My only comment is that I tripped over inadvertent version skew with the
docs while trying to code exactly the case discussed. I am building a
small lexer which I naturally wanted to implement with a jump table to
labels corresponding to automaton states, and when I couldn't get it to
work I finally fell back on the switch solution too. But it offends my
moral sensibilities. :)

Since I'm a newcomer to LLVM I can't comment on the implementation of
the extension at all, but will make a wild guess that all the "creative"
uses of label addresses are going to be like the Linux case you
describe--it sounds like some sort of debugging hackery. If they
interfere with optimizations, perhaps you can support them only with
optimizations shut off (at least for that bit of code), or better just
tell the relevant optimizer stages to leave that code alone when weird
usages are detected. I for one wouldn't expect you to kill yourself so
my debugging code could be aggressively optimized.

Now that I've said that I'll no doubt think of some other use I *would*
like optimized. It would have to be pretty strange, though.

Dustin
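
The lexer case described above, a jump table of labels indexed by automaton state, might look roughly like the following. This is a minimal sketch and not Dustin's actual code: the two-state word-counting automaton, the function name `count_words`, and the state names are all assumptions chosen to keep the example short. It uses the GNU labels-as-values extension.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical toy lexer: counts runs of alphanumeric characters
   ("words") in s. All names here are illustrative. */
size_t count_words(const char *s)
{
    /* One label per automaton state, indexed by state number. */
    static void *const state_table[] = { &&between_words, &&in_word };
    enum { BETWEEN_WORDS = 0, IN_WORD = 1 };
    size_t words = 0;
    int state = BETWEEN_WORDS;

    assert(s != NULL);

    for (;;) {
        unsigned char c = (unsigned char)*s;
        if (c == '\0')
            return words;
        s++;

        /* Jump to the handler for the current state. */
        goto *state_table[state];

    between_words:
        if (isalnum(c)) {
            words++;            /* a new word starts here */
            state = IN_WORD;
        }
        continue;

    in_word:
        if (!isalnum(c))
            state = BETWEEN_WORDS;
        continue;
    }
}
```

The switch-based fallback mentioned above would replace the `goto *state_table[state];` line with `switch (state)` and turn the two labels into `case` labels; the state handlers themselves stay the same.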